Abstract


A  key  problem  in  model-based  object  recognition  is  selection,  namely,  the  problem  of  isolating  regions
in  an  image  that  are  likely  to  come  from  a  single  object.  This  isolation  can  be  either  based  solely  on 
image  data  (data-driven)  or  can  incorporate  the  knowledge  of  the  model  object  (model-driven).  In  this 
paper  we  present  an  approach  that  exploits  the  property  of  closely-spaced  parallelism  between  lines  on 
objects  to  achieve  data  and  model-driven  selection.  Specifically,  we  present  a  method  of  identifying 
groups  of  closely-spaced  parallel  lines  in  images  that  generates  a  linear  number  of  small-sized  and  reliable 
groups  thus  meeting  several  of  the  desirable  requirements  of  a  grouping  scheme  for  recognition.  The  line 
groups  generated  form  the  basis  for  performing  data  and  model-driven  selection.  Data-driven  selection  is 
achieved  by  selecting  salient  line  groups  as  judged  by  a  saliency  measure  that  emphasizes  the  likelihood 
of  the  groups  coming  from  single  objects.  The  approach  to  model-driven  selection,  on  the  other  hand, 
uses  the  description  of  closely-spaced  parallel  line  groups  on  the  model  object  to  selectively  generate  line 
groups  in  the  image  that  are  likely  to  be  the  projections  of  the  model  groups  under  a  set  of  allowable 
transformations  and  taking  into  account  the  effect  of  occlusions,  illumination  changes,  and  imaging  errors.
We  then  discuss  the  utility  of  line  groups-based  selection  in  the  context  of  reducing  the  search  involved 
in  recognition,  both  as  an  independent  selection  mechanism,  and  when  used  in  combination  with  other 
cues  such  as  color.  Finally,  we  present  results  that  indicate  a  vast  improvement  in  the  performance  of  a 
recognition  system  that  is  integrated  with  parallel  line  groups-based  selection. 
1  Introduction

A key problem in object recognition is selection, namely, the problem of identifying regions in an image within which to start the recognition process, ideally by isolating regions in an image that are likely to come from a single object. Model-based object recognition methods that try to recognize which members of their library of models are present in the scene usually use geometric features such as points or edges and try to identify pairings between data and model features that are consistent with a rigid transformation of the object model into image coordinates. The large number of such pairings that need to be examined in cluttered scenes leads to a combinatorially explosive search problem. It has been shown that this search can be considerably reduced if recognition systems are equipped with a selection stage where subsets of data features can be isolated that are likely to come from a single object, thus allowing the search to be focused on those matches that are more likely to lead to a correct solution [9]. This isolation can be either based solely on image data (data-driven) or can incorporate the knowledge of the model object (model-driven). Even though selection can be of help in recognition, it has been found difficult to achieve in practice. What makes selection so difficult? In the ideal case, if the appearance of the desired object in the scene were known, and objects in the scene were nicely separated and distinguishable from the background, and the illumination conditions were known, then even simple methods that rely on intensity measurements would work well to extract groups of features. But in reality, the appearance of the object is not known. In addition, illumination conditions and surface geometries of objects present in a scene can cause problems of occlusion, shadowing, specularities, and inter-reflections in the image and make it difficult to interpret groups of data features such as edges and lines as belonging to a single object. Previous approaches have mainly considered data-driven selection by treating it as a problem of grouping data features based on some constraint such as parallelism or collinearity [17], distance and orientation [13], and regions enclosed by a group of edges [3] to capture some meaningful structure in a scene. Grouping was introduced as a technique for reducing the search involved in recognition by dividing the search for matching features into a search for a matching pair of groups followed by a search for corresponding features within the matching groups. To effectively reduce the search in recognition, however, a grouping scheme must produce a small number of small-sized groups that are reliable (i.e., span a single object) [5]. Existing grouping schemes usually meet some but not all of these requirements. For example, schemes that capture only meaningful structure without reasoning about scene geometry can produce groups that are unreliable, such as the grouping of salient image contours described in [25]. In other grouping schemes, the weakness of the constraints used can lead to a large (potentially exponential) number of groups [13]. Other schemes that restrict the number of groups can either cause some relevant groups to be missed, leading to false negatives during recognition [11], or cause the groups to become unreliable [3].

So the general problem of selection remains largely unsolved, as it is still not obvious how to reliably characterize subsets of data features that will give clues that point to a single object. We have been involved in developing a computational model of selection that proposes that selection can be achieved via an attention mechanism. Specifically, it is an attempt to build a computational model of visual attentional selection in humans, and to propose it as a selection mechanism for recognition. Towards this end, two modes of human attentional behavior, namely attracted-attention and pay-attention modes, have been isolated to serve as paradigms for data-driven and model-driven selection respectively. The attracted-attention mode of behavior is spontaneous and is commonly exhibited by an unbiased observer (i.e., with no a priori intentions) when some objects or some aspects of the scene attract his/her attention. The pay-attention mode is a more deliberate behavior exhibited by an observer looking at a scene with a priori goals (such as the task of recognizing an object, say) and hence paying attention to only those objects/aspects of a scene that are relevant to the goal. According to this model, therefore, data-driven selection can be achieved by identifying regions in an image that attract attention (i.e., that are distinctive) with respect to some feature such as color or texture, while model-driven selection can be achieved by paying attention to the model features (i.e., using the model features to decide saliency of features in the image). While it is understandable that paying attention to model features can help isolate areas in the image that could contain subsets of data features that are likely to come from a single object (or the specific model object in this case), it is not immediately apparent how locating salient regions is an appropriate way of performing data-driven selection. Such a choice is, however, motivated by the observation that an object often stands out in a scene because of some distinctive features that are usually localized to some portion of the object. Therefore isolating distinctive regions is more likely to point to a single object, making such regions an appropriate choice in the absence of any specific information about the model object. A number of other approaches have also suggested that selection, at least data-driven, can be performed based on some measure of saliency [24].

The above discussion indicates a framework for achieving data and model-driven selection. But how can salient regions be found for data-driven selection, and how can the object model affect the choice of salient regions for model-driven selection? In earlier work we had presented methods of selection based on color [27, 28] and texture [29]. There it was shown that the number of data features can be greatly reduced by such a selection. But since the regions isolated were rather large, the groups contained a large number of features, causing the number of matches between model and image features to be still considerably large. If features within such regions could be grouped further such that only a small number of features fall into a group, then by finding a correspondence between such small region groups on the model and image, the total number of combinations of model and image features can be greatly reduced. In this paper, we explore the use of a property called closely-spaced parallelism that is often exhibited by lines on objects, to perform data and model-driven selection. Specifically, we show how small-sized and reliable groups can be generated using this constraint of closely-spaced parallel lines and how such groups can be used to perform data and model-driven selection. Even though grouping is still at the heart of such a selection mechanism, we show that the use of closely-spaced parallelism as a constraint causes it to meet several of the desirable requirements for recognition, albeit at the expense of an increase in the worst-case search complexity over conventional grouping.

The rest of the paper is organized as follows. We first discuss the need for grouping data features into small-sized groups for the purposes of recognition. This gives us a set of requirements that must be met by any scheme for grouping data features. We then briefly review the existing grouping methods in the light of these requirements. Next, we present a method for grouping line features that exploits the property of closely-spaced parallelism among lines. We then explore the use of line groups as a feature by itself for performing data and model-driven selection. In keeping with the general paradigm of attentional selection presented above, data-driven selection is achieved by selecting some salient line groups, while model-driven selection is achieved by utilizing the description of line groups on the model object. Next, we show how such line groups-based selection can be combined with other methods of selection based on cues such as color, to further reduce the search in recognition. Lastly, we present results that indicate the actual improvement in performance of a recognition system that uses line groups-based selection.

2  Role  of  Grouping  in  Model-based 
Recognition 

Region selection using color and texture as described in earlier work [27, 28, 29] reduced the search involved in recognition by removing a large number of data features from consideration. Even so, once a set of regions is selected, a large number of matches between features in corresponding model and image regions may have to be tried. Using the alignment method for recognition (in particular, the linear combination of views version of this method [31]) we know that at least four matching features must be found for alignment (and hence recognition). If there are $M$ features (say, points) in a model region and $N$ features in the corresponding image region, then $O(M^4 N^4)$ matches per pair of corresponding regions may have to be tried, in the worst case, with such region selection. For the typical number of features ($M \approx 100$, $N \approx 300$) found in color or texture regions, this is still a very large number of matches to be tried ($\approx 10^{18}$). If the data features within such regions can be further grouped into some meaningful structures or groups consisting of a small number of features each, then the search can be reduced by pairing such groups, and trying combinations of features within matching groups, as before. Previous research has explored the role of grouping in recognition for reducing the search in precisely this fashion [13, 3, 17, 9]. To see how grouping of features can reduce the combinatorics of search drastically, we reproduce here the analysis of grouping given in earlier work [13, 3].
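As a rough check on the size of the ungrouped search, the following sketch evaluates the worst-case count for the region sizes quoted above. It assumes the bound is M^4 N^4 (four matched features, as required by alignment); the function name `ungrouped_matches` is illustrative and does not come from the original work.

```python
def ungrouped_matches(M: int, N: int, k: int = 4) -> int:
    """Order-of-magnitude bound M^k * N^k on k-feature match hypotheses
    between one model region (M features) and one image region (N features).
    k = 4 is the minimum number of matches assumed for alignment."""
    return (M ** k) * (N ** k)


# Typical region sizes quoted in the text: M = 100, N = 300.
count = ungrouped_matches(100, 300)
print(f"{count:.1e}")  # 8.1e+17, i.e. on the order of 10^18
```

Even for a single pair of corresponding regions, the count is far beyond what exhaustive verification could cover, which is the motivation for further grouping.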

Let us consider the case of grouping being performed both on the model and image features. Let $M_g$ and $N_g$ be the number of model and image groups respectively, and let $m_i$ and $n_j$ be the number of features in the model group $i$ and image group $j$. If the sizes of the model and image groups are identical, and each group contains features coming from a single object, then the number of matches that need to be tried is $O(\sum_{i=1}^{M_g} \sum_{j=1}^{N_g} m_i!)$, since all pairs of model and image groups may have to be tried and $m_i!$ accounts for all permutations of feature matches within a pair of matching groups. Further, if the features in a group can be linearly ordered, the number of matches reduces to $O(\sum_{i=1}^{M_g} \sum_{j=1}^{N_g} m_i)$. If the number of features in the model and image groups are not identical or if not all the features in groups come from a single object, then assuming at least one image group contains at least 4 features of a model group, a solution for the pose of the model object can be obtained by trying, in the worst case, all matches of four features within each pair of image and model groups. The number of matches that need to be tried in such a case becomes $O(\sum_{i=1}^{M_g} \sum_{j=1}^{N_g} m_i^4 n_j^4 \, 4!)$. For small-sized groups (say, about 5 features each), this is essentially $O(M_g N_g)$ or linear in the number of groups.¹
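The effect of grouping on the worst-case count can be illustrated numerically. The sketch below assumes the worst-case bound is the sum over all group pairs of m_i^4 n_j^4 4!; the group sizes and counts (20 model groups and 60 image groups of 5 features each, giving the same 100 and 300 features as before) are hypothetical values chosen only for illustration, and `grouped_matches` is an illustrative name.

```python
from math import factorial


def grouped_matches(model_sizes, image_sizes, k=4):
    """Worst-case number of k-feature matches when every model group is
    paired with every image group: sum of m_i^k * n_j^k * k! over all pairs."""
    return sum((m ** k) * (n ** k) * factorial(k)
               for m in model_sizes
               for n in image_sizes)


# Hypothetical grouping of 100 model and 300 image features into groups of 5.
model_groups = [5] * 20
image_groups = [5] * 60

total = grouped_matches(model_groups, image_groups)
pairs = len(model_groups) * len(image_groups)
print(total)           # 11250000000
print(total // pairs)  # 9375000 matches per group pair (a constant)
```

Because the per-pair cost is a constant for fixed small group sizes, the total grows with the number of group pairs, and it is many orders of magnitude below the ungrouped bound for the same feature counts.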

From the above analysis, we see that in order to reduce the search involved in recognition, a grouping scheme must possess some desirable properties. Ideally, a grouping method must produce highly reliable (that is, coming from a single object) equal-sized groups in the model and image. If this is not possible, each group must contain at least a sufficient number of features (four being the minimum) coming from a single object to make recognition possible. When groups satisfy this "minimum reliability", the number of extraneous features in a group must be as small as possible. In other words, it is desirable to have small-sized groups, so that the complexity of search remains linear in the number of groups. Another requirement to keep the number of matches small is to lower the number of possible groups (to a low-order polynomial). The number of groups cannot, however, be reduced by arbitrarily discarding groups, as this could create unnecessary false negatives during recognition. That is, it may cause an object to go unrecognized because groups corresponding to the model groups were discarded. Finally, since grouping is a pre-processing step to recognition, the group generation process (i.e., the algorithm for assembling the groups) itself must be fast and simple.

The above discussion suggests that one of the keys to making grouping useful for recognition is to group features in an image based on constraints that capture some salient and easily detectable structures that point in turn to meaningful structures on objects in scenes. In this way, the number and size of groups can be kept small, since not all tuples of features will be meaningful, and being easily detectable, the groups can be generated by a fast and efficient algorithm. Finally, since such groups point to meaningful structures on objects in scenes, they tend to be more reliable.

¹This ignores the effort required for verifying a match, assuming it is the same for recognition with or without grouping.

2.1  Approaches  to  grouping 

We now examine some of the previous work on grouping in vision in the light of the above requirements for their use in recognition. We will focus here on grouping of edge features, remarking on grouping methods for other data features only briefly. More extensive reviews of grouping are available elsewhere in the literature [13], [17].

Grouping was initially studied in psychology, mainly as a perceptual phenomenon. There it was noticed that when we look at an edge image of a scene, we often pick up any structural information present in a collection of edges or lines. Figure 1 illustrates this with examples of line arrangements in which we can identify some perceptual structure. Early Gestalt psychologists demonstrated through a variety of examples that humans use cues such as simplicity, proximity, similarity, symmetry and familiarity for grouping features [33]. Their explanation for the perception of groups based on such cues seemed plausible but lacked a quantitative basis due to the difficulty in precisely defining concepts such as simplicity and familiarity. Later studies tried to make terms such as simplicity a little more concrete by using the concept of minimum entropy from information theory [10]. The explanations put forward by psychologists about this ability to group a collection of features based on constraints all seem to imply that it reflects an underlying knowledge of what makes a collection of edges come from a single object. In other words, the grouping process reflects an inherent bias towards collecting those edges that are likely to belong to a single object.

While the work on grouping in psychology had concentrated on observing it as a phenomenon and developing explanations for it, the work on grouping in computer vision has focused more on ways of making it useful for computer vision. Towards this end, several roles of grouping have been envisaged. In the early work of Marr, for example, grouping was suggested as a way of abstracting information in the raw primal sketch derived from the image [18]. He suggested grouping based on constraints such as curvilinearity, parallelism, and collinear displacements. Later work developed techniques to perform grouping such as the use of the Hough transform to capture collinearity information in points [8]. Grouping was also suggested as a useful step in both geometric and symbolic methods of recognition. Lowe proposed grouping as a way of establishing good primitives for recognition [17]. Jacobs and Clemens showed the extent of search reduction possible using grouping as a pre-processing step in recognition [5]. The role of grouping in geometric methods of recognition was merely to organize the data features, while the actual recognition was done by using the data features from the groups. The role of grouping in symbolic recognition methods, on the other hand, was to provide the groups themselves as high-level match primitives to be used directly for recognition using symbolic reasoning techniques. Early vision systems used grouping in this sense, such as ACRONYM in which edges were grouped into ribbons and the recognition of objects proceeded based on the ribbons and their topology [2]. More recently, grouping has been used for purposes of indexing into a library of objects in geometric methods of recognition [4], [15]. It has also been used for this purpose in symbolic methods of recognition to extract meaningful structures in scenes based on constraints of parallel and skew symmetry [26], [19] and proximity [7].

The role of grouping in extracting meaningful structures in scenes has also been emphasized in grouping schemes based on region and contour features (rather than edge or point features). This can be seen in the work of Shashua and Ullman on the grouping of image contours to capture salient curves [25], of Dolan and Weiss on the grouping of curved lines using proximity [7], and of LeClerc on hierarchical grouping of regions based on the minimum description length principle [16].

A class of approaches in computer vision has attributed the tendency to group features to the ability of humans to recognize the non-accidental occurrence of the relation underlying the groups. That is, the degree to which a relation is unlikely to have arisen by an accident of viewpoint, rather than the knowledge that they belong to a single object, is the motivation behind grouping features based on that relation. This was concluded by Witkin and Tenenbaum [34] after observing that such non-accidentalness was used to interpret groups of features even when the ultimate interpretation of the groups was not known. They also pointed out that since such a relation was expected to remain stable over a large number of viewpoints, it must reflect some meaningful structure in the scene. Lowe extended the non-accidentalness argument behind grouping to identify the set of image relations that are unlikely to occur by an accident of viewpoint [17]. Using the assumption that the viewpoint of the camera is independent of the objects in the scene, he showed that only certain image relations, such as convexity and parallelism, are likely to remain stable over a large range of viewpoints. He also concluded that because of this viewpoint invariance, the detection of such relations in an image implied that they were likely to be the projection of a meaningful and specific 3D structure. Lowe showed that this property can make such groups useful for the recognition of three-dimensional objects. For example, using the non-accidentalness of viewpoint, he showed that parallel lines in the image are most likely to come from parallel lines in space. So if the model object contained parallel lines, then this justifies the matching of (groups of) parallel lines in the image to (groups of) parallel lines on the model, thus making such groups useful primitives for recognition.

Figure 1: Illustration of perceptual structure apparent in line arrangements. (a) A group of lines perceived as a collection of squares due to closure (or continuation). (b) Lines seen as crossing due to good continuation. (c) Bilateral symmetry evident from the group of lines shown. (Adapted from Figure 2-1 of [17].)

Another class of approaches in computer vision explored the same argument for grouping that was put forward by psychologists based on the likelihood of the grouped features coming from a single object. For example, in the early work of Roberts, vertices connected by straight edges were grouped using the rationale that connected vertices were likely to be part of the same object [22]. This argument was given a more quantitative basis by Jacobs, who proposed that grouping of data features in an image should be based on a relation that points to the likelihood of such features coming from a single object [13]. Specifically, he used distance and orientation constraints to explore the convexity relation between a group of edges. By doing a statistical analysis of occlusions and merging of edges with background in images, he showed that it is unlikely for a randomly selected group of edges to form a convex polygon, thereby implying that the detection of such a relation between edges pointed to their likelihood of coming from single objects. Other researchers who have used a similar argument for grouping are Bolles and Cain, who used the proximity relation to group features in their local feature focus method [1], Brooks, who grouped edges forming ribbons or trapezoids [2], and Clemens, who grouped edges enclosing open regions [3].

Let us now evaluate some of the existing schemes for grouping from the point of view of recognition. Grouping schemes, such as that of Shashua and Ullman [25] for grouping image contours, that attempt to capture meaningful structures in scenes without reasoning about scene geometry, often produce groups that span wide areas of the image, making them unreliable for recognition. Other grouping schemes based on viewpoint invariance that use constraints or relations that are likely to hold over a wide range of viewpoints such as parallelism [17] or convexity [13], [11] also produce groups that attempt to capture meaningful structures in the scene. However, the ease with which such relations are detected in images decides the number of groups generated as well as their size. Since several nearby edges could satisfy relations such as parallelism and convexity, different combinations of edges have to be explored by grouping algorithms, leading to a very large number of groups and taking time of equal complexity. For example, Huttenlocher grouped edges based on connectivity by considering all possible sequences of edges of length three (leading to $O(N^3)$ groups for $N$ edges) [12]. Later work on grouping tried to generate a smaller number of groups by filtering some of the groups. In Huttenlocher and Wayner [11], for example, a grouping algorithm was presented that works in $O(n \log n)$ time and generates a linear number of convex edge groups. The filtering was done by using a cost function to rank neighbors of an edge and allowing only the least-cost neighbor to participate in a convexity relation. Although the number of groups is restricted by this method, it is not clear whether such decisions can be made on a purely local basis. Also, since there is no analysis of the kind of groups that will be missed, it is not clear that missing such groups does not cause a recognition system to make false negative identifications. Another grouping scheme explored by Clemens also restricts the number of groups to be linear in the number of edges [3]. Here the groups are designated by open regions that are enclosed by a group of edges. Such open regions of the image were considered likely to come from a single object because a transition from one object to another almost always caused a change in intensity sufficient to produce an edge that splits a region. The grouping algorithm used assigns an edge to at most four regions, thus ensuring that the number of groups remains linear in the number of edges. Due to feature instabilities, imaging artifacts, etc., an open region is rarely bounded by edges forming a complete connected closure, causing such assignments of edges to 4 neighbors using purely local judgment to group together edges that do not necessarily come from a single object. Thus in the existing approaches to grouping, it appears that restricting the number of groups may either cause some relevant groups to be missed or may make the grouping scheme unreliable, causing it to group features that don't necessarily come from a single object.

3  Grouping  Based  on  Closely-Spaced 
Parallelism 

We now present a grouping method that exploits the relation of closely-spaced parallelism commonly occurring between lines on objects, to produce groups that possess many of the desirable properties for purposes of recognition. Many commonly occurring objects in indoor scenes such as books, cups or tables possess some pattern-like structures that often attract our attention. Such structures usually contain groups of closely-spaced parallel lines of a few orientations. For example, printed letters on the surface of an object such as a book or a bottle, and wooden texture on pieces of furniture such as a table, contain groups of closely-spaced parallel lines. Sometimes such parallel lines form texture-like patterns as on the bottle in Figure 2a, while in other cases they capture some interesting structures from parts of objects such as the parallel contours in the triangular block of Figure 2a. Even when they can be treated as textures we consider them as a separate cue since the property of parallelism they capture is of direct use in recognition as a grouping method.²

The groups of parallel lines we want to capture include
cases of both explicit and implicit parallelism. Figure 2a
shows a scene containing objects showing instances of
both types of parallelism. The contour of the triangular
block has two explicitly parallel lines as can be seen
from Figure 2b, while the letter texture on the bottle
has implicit parallelism as can be seen from the group
of parallel lines in Figure 2c where only the nearly
horizontal lines of Figure 2b are highlighted. As we will see
later, the projections of such patterns in images continue
to show closely-spaced parallelism among the projected
lines over a wide range of viewpoints. This makes it
possible to capture closely-spaced parallelism on objects by

*Also,  in  the  context  of  building  the  attentional  selection 
model,  while  color  and  texture  have  been  primary  features 
being  extracted  directly  from  the  intensity  image,  parallel¬ 
line  groups  serve  as  a  secondary  feature  being  extracted  from 
the  edges  or  line  features.  So  treating  them  as  a  separate 
cue  illustrates  an  implementation  of  the  model  of  attentional 
selection  possessing  a  feature  hierarchy. 


Figure  2:  Illustration  of  implicit  and  explicit  closely-spaced  parallelism  on  objects,  (a)  An  image  of  a  scene  containing 
objects  showing  explicit  and  implicit  parallelism,  (b)  Line  segment  image  of  (a).  Note  the  parallelism  explicit  in  the 
contour  of  the  triangular  block,  (c)  An  image  showing  only  the  nearly  horizontal  lines  in  the  image  of  (b).  Note  that 
the parallelism implicit in the letter texture on the bottle in (b) becomes explicit in this image.

examining such a relation between the edges (or lines)
in their projections. Since it is rare that adjacent object
regions in an image possess similarly-spaced parallel
lines of similar orientation, the detection of closely-spaced
parallel lines in images is also likely to point to
single objects. Further, since not all edges in the image
are likely to show closely-spaced parallelism, this could
lead automatically to fewer groups. And for objects
showing characteristic textural information that contains
such closely-spaced parallel lines, the groups can be useful
clues that point to the identity of such objects. Also,
when the spacing allowed between parallel lines is small,
such groups capture compact areas in both the image
and the object and contain fewer features in a group.
Finally, as we will see later, such groups can be easily
found in images using a simple algorithm. Thus
groups of closely-spaced parallel lines in images capture
not only meaningful structures on objects in scenes but
also possess the desirable properties required of a grouping
scheme for recognition.

Grouping based on such parallelism, however, has the
disadvantage that, unlike in conventional grouping, a single
pair of matching groups is not sufficient for recognition.
This is because recognition methods such as the
linear combination of views-based alignment method require
at least 4 non-coplanar points for alignment. Since
a group of parallel lines in space spans a plane, the features
such as points or lines derived from them are coplanar,
so at least two matching pairs of groups must
be found. However, since more than four corresponding
features are needed in practice, other grouping schemes
have also found the need for finding more than a pair of
matching groups [13].


3.1  Closely-spaced parallelism constraint

So far we have only loosely specified the property of
closely-spaced parallelism and have given intuitive arguments
about the advantages of grouping based on this
relation. We now make the definition more precise to
allow the generation of groups from line segments in an
image based on this relation. Ideally, the structure in
space we want to capture using the closely-spaced parallelism
constraint is a set of (3D) parallel line segments
on an object with a given inter-line spacing. To see how
such a structure appears in an image (i.e., in a projection),
we exploit some well-known results in descriptive
geometry [14]. These results indicate that under orthographic
projection and scale (often used to approximate
perspective projection), a parallel-line group in 3D always
projects to a group of parallel lines in the image
under any view. In practice, because of the noise in the
imaging process, and depending on the method used to
obtain line segments from edges, such a group appears
as a set of closely-spaced lines with slight skew between
the lines but with the overall orientation of the group
remaining more or less uniform. When perspective effects
are dominant, however, parallel lines in 3D appear as a
set of converging lines. For most imaging distances, this
convergence is slight, so that such lines have only a small
amount of inter-line (as well as overall) skew. Thus 3D
line segments on objects showing strict closely-spaced
parallelism between them actually appear as groups of
closely-spaced lines in the image that are almost parallel (i.e.,
with slight inter-line as well as overall skew). However,
we will refer to such groups in both the image and object
as closely-spaced parallel-line groups (or in short as
line groups) with the implication of strict parallelism
between lines in 3D and approximate parallelism between
their projections.

To precisely define such groups in an image, we begin
with some terminology relating to 2D non-intersecting
line segments.

3.1.1  Terminology

1.  Overlapping lines: Two line segments are said to overlap
if the projection of at least one end point of one of
the lines lies inside the other line segment. Figure 3a
shows examples of overlapping and non-overlapping line
segments.

2.  Across-the-line distance: The across-the-line distance
between two lines $L_1$ and $L_2$, whose end points are
designated by $p_{l_1 1}, p_{l_1 2}, p_{l_2 1}, p_{l_2 2}$ respectively, is
defined as follows. Let $S$ denote the set of pairs
$\{(p_{l_1 1}, p_{l_2 1}), (p_{l_1 1}, p_{l_2 2}), (p_{l_1 2}, p_{l_2 1}), (p_{l_1 2}, p_{l_2 2})\}$. Let
$d_{min} = \min\{d(p_i, p_j) \mid \forall (p_i, p_j) \in S\}$, where $d(p_i, p_j)$ is
the Euclidean distance between the points of the pair
$(p_i, p_j)$. Let $(p_r, p_s)$ be the pair in $S$ that has this minimum
distance $d_{min}$. Let $L(p_r)$ and $L(p_s)$ be the lengths
of the projection of points $p_r$ and $p_s$ onto lines $L_2$ and
$L_1$ respectively. Then the across-the-line distance $d_{across}$
between lines $L_1$ and $L_2$ is defined as

$$
d_{across} =
\begin{cases}
\min\{L(p_r),\, L(p_s)\} & \text{if } L_1 \text{ and } L_2 \text{ are overlapping} \\
\text{[illegible in source]} & \text{otherwise}
\end{cases}
\qquad (1)
$$

Figure 3b shows examples of some non-intersecting
line segments and the across-the-line distance between
them.

3.  Along-the-line distance: The along-the-line distance
$d_{along}$ between two lines $L_1$ and $L_2$ is defined as:

$$
d_{along} = \min\left\{ \sqrt{d_{min}^2 - L(p_r)^2},\ \sqrt{d_{min}^2 - L(p_s)^2} \right\}
\qquad (2)
$$

[Equation (2) is a two-case definition in the source, with
conditions beginning "if $L_1$ a..." and "otherw..."; only the
expression above is legible.]

where the terms $d_{min}$, $L(p_r)$, and $L(p_s)$ are as given in
Definition 2. Figure 3c shows some non-intersecting line
segments and the along-the-line distance between them.
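The three definitions above can be sketched in code. The following is a minimal sketch, not the authors' implementation. Segments are assumed to be pairs of (x, y) end points with non-zero length. Because part of the two-case definitions is not legible in the source, two assumptions are made and marked in the comments: the across-the-line distance is taken as the smaller perpendicular projection length in both cases, and the along-the-line distance is taken as zero for overlapping segments. The function names are illustrative.

```python
import math

def project(p, a, b):
    """Project point p on the infinite line through a and b.

    Returns (t, dist): t locates the foot of the perpendicular along a->b
    (0 <= t <= 1 means it lies inside the segment), dist is the
    perpendicular distance from p to the line.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    foot = (a[0] + t * dx, a[1] + t * dy)
    return t, math.dist(p, foot)

def overlapping(s1, s2):
    """Definition 1: some end point of one segment projects inside the other."""
    pairs = [(s1[0], s2), (s1[1], s2), (s2[0], s1), (s2[1], s1)]
    return any(0.0 <= project(p, seg[0], seg[1])[0] <= 1.0 for p, seg in pairs)

def across_along(s1, s2):
    """Return (d_across, d_along) under the assumptions stated in the lead-in."""
    # Closest pair of end points (p_r on s1, p_s on s2) and its distance d_min.
    p_r, p_s = min(((p, q) for p in s1 for q in s2),
                   key=lambda pq: math.dist(pq[0], pq[1]))
    d_min = math.dist(p_r, p_s)
    l_r = project(p_r, s2[0], s2[1])[1]   # L(p_r): p_r projected onto line 2
    l_s = project(p_s, s1[0], s1[1])[1]   # L(p_s): p_s projected onto line 1
    d_across = min(l_r, l_s)              # ASSUMPTION: same form in both cases
    if overlapping(s1, s2):
        d_along = 0.0                     # ASSUMPTION: no gap along the line
    else:
        d_along = min(math.sqrt(max(d_min ** 2 - l_r ** 2, 0.0)),
                      math.sqrt(max(d_min ** 2 - l_s ** 2, 0.0)))
    return d_across, d_along
```

For two horizontal segments stacked 3 units apart, the sketch reports an across-the-line distance of 3 and no along-the-line gap; for two segments offset along their direction, the gap is recovered from the closest end-point pair by Pythagoras.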


3.1.2  A  closely-spaced  parallel-line  group 

A closely-spaced parallel-line group in the image,
specified by the tuple

$\langle t_{across}, t_{along}, t_{local\text{-}orient}, t_{global\text{-}orient} \rangle$

is the largest group of non-intersecting line segments
such that for each line in the group, there exists another
line in the group obeying all of the following constraints:

1.  The across-the-line distance $d_{across}$ between the
lines is no more than the threshold $t_{across}$.

2.  The along-the-line distance $d_{along}$ between the lines
is no more than the threshold $t_{along}$.

3.  The orientation difference between the lines is no
more than a threshold $t_{local\text{-}orient}$.

Moreover, the entire group must satisfy the condition
that the maximum orientation change between any
two lines in the group is no more than a threshold
$t_{global\text{-}orient}$.

Figure 3: Illustration of some of the terminology relating to 2D non-intersecting line segments. (a) The difference
between overlapping (i) and non-overlapping (ii) line segments according to Definition 1 in the text. (b) Across-the-line
distance shown for both overlapping (i) and non-overlapping (ii) line segments. (c) Along-the-line distance
shown for both overlapping (i) and non-overlapping (ii) line segments.
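To make the definition concrete, the sketch below checks whether a given set of lines satisfies the pairwise constraints and the global orientation constraint. It is an illustration only: the `Thresholds` tuple, the function names, and the convention that the distances are supplied as callables are assumptions, orientations are taken in degrees, and the "largest group" (maximality) requirement is not checked.

```python
from typing import Callable, Dict, NamedTuple, Sequence

class Thresholds(NamedTuple):
    t_across: float
    t_along: float
    t_local_orient: float
    t_global_orient: float

def angle_diff(a: float, b: float) -> float:
    """Smallest difference between two undirected orientations, in degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def is_valid_group(group: Sequence[int],
                   orientation: Dict[int, float],
                   d_across: Callable[[int, int], float],
                   d_along: Callable[[int, int], float],
                   th: Thresholds) -> bool:
    """Check constraints 1-3 (each line needs one partner) and the global span."""
    def linked(i: int, j: int) -> bool:
        return (d_across(i, j) <= th.t_across
                and d_along(i, j) <= th.t_along
                and angle_diff(orientation[i], orientation[j]) <= th.t_local_orient)

    each_has_partner = all(any(linked(i, j) for j in group if j != i)
                           for i in group)
    globally_parallel = all(
        angle_diff(orientation[i], orientation[j]) <= th.t_global_orient
        for i in group for j in group)
    return each_has_partner and globally_parallel
```

Note that the pairwise constraints are chained: a line only needs one partner in the group, which is why the separate global orientation bound is needed to stop a chain of small deviations from accumulating.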

The above definition of closely-spaced parallelism allows
for almost parallel lines to be grouped, which as
we said above, is a more useful structure to capture
in the image. Further, by allowing non-overlapping
lines to be grouped, it can not only capture groups
of non-overlapping parallel lines in space, but also allows
some occlusions that cover portions of line segments
to be handled. The constraint on global orientation
change, $t_{global\text{-}orient}$, is imposed to keep the
entire group almost parallel since otherwise successive
deviations in orientation between lines could lead to
a group of fairly skewed lines. This also makes the
above grouping constraint different from the one used
earlier for assembling groups of parallel lines in a
data-driven fashion [21]. Finally, the choice of the thresholds
$t_{across}$, $t_{along}$, $t_{local\text{-}orient}$, $t_{global\text{-}orient}$ dictates the
kind of groups that will be generated. We will discuss
their choice when using the groups to perform data and
model-driven selection.

3.2  Algorithm  to  generate  the  line  groups 

We now present an algorithm to generate closely-spaced
parallel line groups in an image for a given choice of
thresholds $\langle t_{across}, t_{along}, t_{local\text{-}orient}, t_{global\text{-}orient} \rangle$.
It works by first extracting line segments from edges in
an edge image using one of the standard algorithms for
line-segment approximation [20]. The resulting line segments
are used to generate the groups as follows:

1.  Each line segment is initially kept in a separate
group.

2.  For each line segment L, the following operations
are done:

(a)  A rectangular neighborhood about L that is
$2t_{across}$ in breadth and $2(t_{along} + l)$ in length, where
$l$ is the length of the line, is scanned, and all lines
that either pass through this neighborhood or have
an end point in it are retained.

(b)  Among the lines obtained in Step 2a, those
that satisfy the local orientation change constraint
$t_{local\text{-}orient}$ with L are retained.

(c)  A new group is formed by successively merging
the enclosing groups of lines obtained after Step
2b with the enclosing group of L, taking care to
see that no enclosing group being added contains a
line violating the $t_{global\text{-}orient}$ constraint with the
currently created group.
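The merging logic of Steps 1 and 2 can be sketched as follows. This is a simplified illustration rather than the paper's implementation: the neighborhood scan of Step 2a is abstracted into a caller-supplied predicate `is_neighbor(i, j)` (a hypothetical name), orientations are assumed to be in degrees, and groups are kept as plain member lists instead of the union-find structure discussed in the analysis.

```python
def angle_diff(a, b):
    """Smallest difference between two undirected orientations, in degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def group_lines(orientations, is_neighbor, t_local, t_global):
    """Group line indices 0..n-1; returns a sorted list of sorted groups."""
    n = len(orientations)
    group_of = list(range(n))              # Step 1: one group per line
    members = {i: [i] for i in range(n)}

    for i in range(n):                     # Step 2: visit each line L in order
        # Steps 2a-2b: lines in the neighborhood of L with similar orientation.
        candidates = [j for j in range(n)
                      if j != i and is_neighbor(i, j)
                      and angle_diff(orientations[i], orientations[j]) <= t_local]
        for j in candidates:               # Step 2c: merge enclosing groups
            gi, gj = group_of[i], group_of[j]
            if gi == gj:
                continue
            # Reject the merge if any cross pair violates the global constraint.
            compatible = all(
                angle_diff(orientations[a], orientations[b]) <= t_global
                for a in members[gi] for b in members[gj])
            if compatible:
                for b in members[gj]:
                    group_of[b] = gi
                members[gi].extend(members.pop(gj))
    return sorted(sorted(m) for m in members.values())
```

With four neighboring lines whose orientations drift by 4 degrees each (0, 4, 8, 12) and thresholds of 6 and 10 degrees, the first three lines merge and the fourth is left out, which shows how the global constraint cuts a chain of locally acceptable deviations.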

3.2.1  Analysis 

The grouping algorithm performs steps 1-2 using the
union-find data structure to record and update information
about line groups [6]. In this data structure,
information is organized as a forest of trees. The essential
information within a tree is summarized in its
root. The basic operations that can be performed on this
data structure are make-set(a) that creates a single
node tree with element a, find(a) that finds the root of
the tree containing a, and union(a, b) that merges the
trees containing elements a and b. Using a technique
called merging by rank with path compression [6], it is
known that a sequence of m make-set, find and union
operations, n of which are make-set operations, takes time
$O(m\,\alpha(m, n))$, where n is the number of elements in the
data structure and $\alpha(m, n)$ is the inverse of Ackermann's function.
For all practical values of m and n, the function $\alpha(m, n)$ is
almost constant, so that a single one of these operations
can be done in nearly constant amortized time.

Using the union-find data structure, Step 1 requires n
make-set operations for n line segments. For each line L,
Step 2a requires all lines to be scanned, requiring $O(n)$
time. Similarly Step 2b requires $O(n)$ time, in the worst
case, to examine all the retained lines. If the least orientation
in a group is stored as part of the information in
the roots of trees, then the constraint checking in Step 2c
can be done by a simple find operation per line. Finally,
the merging in Step 2c can be done by a union operation.
Thus the entire Step 2 can be done in time $O(n)$ per line
with the result that the grouping algorithm itself runs in
$O(n^2)$ worst-case time.
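A union-find structure whose roots summarize the orientation range of each group might look like the sketch below. It is an illustration under assumptions: the class and method names are hypothetical, both the least and the greatest orientation are stored at the root (the text mentions only the least), and wrap-around of orientations at 180 degrees is ignored for brevity.

```python
class LineGroups:
    """Disjoint sets of line ids with union by rank and path compression."""

    def __init__(self):
        self.parent, self.rank = {}, {}
        self.lo, self.hi = {}, {}          # orientation range, valid at roots

    def make_set(self, x, orientation):
        self.parent[x] = x
        self.rank[x] = 0
        self.lo[x] = self.hi[x] = orientation

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:      # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def span_if_merged(self, a, b):
        """Orientation span the merged group would have (read from the roots)."""
        ra, rb = self.find(a), self.find(b)
        return max(self.hi[ra], self.hi[rb]) - min(self.lo[ra], self.lo[rb])

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.rank[ra] < self.rank[rb]:  # attach the shallower tree
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        self.lo[ra] = min(self.lo[ra], self.lo[rb])
        self.hi[ra] = max(self.hi[ra], self.hi[rb])
        return ra
```

Before merging in Step 2c, a caller would compare `span_if_merged(a, b)` against the global orientation threshold and call `union(a, b)` only when the bound holds, so that each check costs two find operations.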

3.2.2  Results 

We now illustrate the grouping algorithm with a few
examples. Figure 4a shows the line segments obtained
by doing a line segment approximation to the edges in
the image of Figure 2a. The closely-spaced parallel line
groups obtained using the grouping algorithm with a
constraint specification of <10, 5, 6, 10> are shown in
Figures 4f-i. These groups are shown along four major
orientations (vertical, horizontal, obtuse, and acute)
for clarity. The individual groups are highlighted by
drawing the convex hull of the end points of line segments.
The line segments that are grouped can be seen
in the corresponding Figures 4b-e. Similarly, Figure 5
shows another example of grouping performed by the
algorithm. By using the algorithm on a number of edge
images, the number of groups, their average size (number
of constituent lines) and the average area spanned
by the groups were recorded. The results are shown in
Table 1. From the table it can be seen that the number
of groups is linear in the number of line segments, and
the size of the line groups tends to be small.


3.2.3  Discussion 

The number of groups generated by the grouping algorithm
is in fact linear in the number of lines, since each
line belongs to at most one group at the end of Step 2.
If the constraints did not involve $t_{global\text{-}orient}$, it is clear
that only a linear number of groups would have been possible
(recall that we are considering only the largest such
groups). With the fourth constraint $t_{global\text{-}orient}$ added,
the starting line as well as the order in which lines are
examined determines the lines that ultimately belong to


Figure 4: Illustration of grouping based on closely-spaced parallelism and salient group detection. (a) Line segments
to be grouped based on closely-spaced parallelism. (b)-(e) Line segments shown along four major orientations, namely,
vertical, horizontal, obtuse, and acute orientations. (f)-(i) The line groups formed using the algorithm, shown also
along the respective major orientations for clarity. The thresholds used were t_across = 10, t_along = 5, t_local-orient =
6°, t_global-orient = 10°. (j) The 40 most salient groups among the line groups of (f)-(i) found using the saliency
measure. Note that none of the salient groups span more than one object.

Figure 5: Illustration of grouping based on closely-spaced parallelism and salient group detection: another example.
(a) Line segments to be grouped based on closely-spaced parallelism. (b)-(e) Line segments shown along four major
orientations, namely, vertical, horizontal, obtuse, and acute orientations. (f)-(i) The line groups formed using
the algorithm, shown also along the respective major orientations for clarity. The thresholds used were t_across =
[illegible], t_along = [illegible], t_local-orient = [illegible], t_global-orient = 10°. (j) The 40 most salient groups among the line groups of (f)-(i)
found using the saliency measure. Note that only two of the salient groups span more than one object.

S.No.  Image Size  Num. Lines  Group Constraints  Num. Groups  Avg. Group Size  Max. Group Size  Avg. Group Area
1.     320 x 576   395         <10, 5, 6, 10>     34           4                21               0.002
2.     256 x 416   756         <6, 5, 6, 10>      91           3.0              11               0.001
3.     240 x 240   233         <5, 0, 6, 10>      22           3.1              6                0.0009
4.     232 x 576   884         <5, 5, 6, 10>      119          3.9              7                0.0008
5.     200 x 492   1232        <5, 10, 9, 12>     243          3.77             13               0.0028
6.     224 x 416   316         <5, 5, 6, 10>      75           3                17               0.003

Table 1: Characteristics of closely-spaced parallel line groups generated by the grouping algorithm. The average group
area is normalized with respect to the image size. Only groups containing more than one line are considered here.


a group as well as its size. In such cases, more groups
than are generated by the algorithm are possible. A case
where this happens is shown in Figure 6. Figure 6a shows
an arrangement of closely-spaced parallel lines in the image
and Figure 6b shows the groups that will be generated
by the algorithm. Finally, Figure 6c shows some
other groups that are possible from the arrangement in
Figure 6a but are not generated by the algorithm. The
groups generated by the algorithm correspond to a left
to right, bottom to top scan of the line segments in the
image. Such a scan often produces groups that resemble
the groups we perceive using a frame of reference with
the origin at the left hand bottom corner of the image.

In general, if a large number of lines fall within the
specified neighborhood of a line (in Step 2a), the number
of possible combinations of lines obeying all 4 constraints
could become very large. The grouping algorithm described
above generates only a subset of such groups,
and in some sense, therefore, does a filtering operation.
We mentioned earlier in Section 2.1 that grouping approaches
that filter groups to keep them to a small
number could cause a recognition system that subsequently
uses these groups to produce unnecessary false
negatives. We now show that this does not happen
with the above grouping algorithm. For this, we notice
that the closely-spaced line groups satisfying all 4
constraints {t_across, t_along, t_local-orient, t_global-orient}
are subsets of groups satisfying the first 3 constraints
{t_across, t_along, t_local-orient} (called main groups here).
The grouping algorithm generates only some of the possible
subsets, but such groups (called aggressive groups,
henceforth) generate a cover of the main group. That
is, every line of a main group belongs to some aggressive
group. Suppose that the groups are fed to a recognition
system. Assuming an alignment style of recognition
(such as the linear combination of views method [32]), we
know that at least 4 matching features must be found to
solve for the pose of the object. Since parallel line groups
in the image that come from parallel lines in space represent
coplanar points, we may need two such groups to
derive these features. Let us assume that each group
provides two features and that the features are derivable
from a single line in each of the groups (the end points
of a line are the features, say). If there existed a pair of
closely-spaced parallel-line groups in the image obeying
all 4 constraints that were the correct pair of groups (i.e.,
they contained a sufficient number of features to recognize
the object) but were not generated by the grouping algorithm,
then a recognition system using the groups given
by the algorithm could produce false negatives. But since
the aggressive groups form a cover, each correct group
has partial overlap with at least one aggressive group:
the line supplying the features of a correct group also
belongs to some aggressive group. This suggests that those
pairs of aggressive groups would also be correct groups
containing sufficient features to recognize the object, thus
preventing a false negative identification.

Thus  the  above  grouping  algorithm  keeps  the  number 
of  groups  small  by  filtering  possible  groups,  but  at  the 
same  time,  prevents  unnecessary  false  negatives  during 
recognition  due  to  insufficient  number  of  groups  being 
produced. 

4  Data-driven  Selection  using  Line 
Groups 

We  now  discuss  the  use  of  closely-spaced  parallel-line 
groups  to  perform  data-driven  selection.  The  goal  of 
data-driven  selection  is  to  isolate  regions  in  an  image 
that  are  likely  to  come  from  a  single  object  based  on  in¬ 
formation  available  in  the  image  and  some  a  priori  knowl¬ 
edge  about  scenes.  For  a  given  choice  of  thresholds,  not 
all  the  groups  generated  by  the  above  algorithm  repre¬ 
sent  useful  structures  in  the  scene  as  can  be  seen  from 
the  examples  in  Figures  4  and  5.  Some  of  the  groups  may 
span  more  than  one  object,  while  others  come  from  spu¬ 
rious  line  segments,  or  scene  clutter  rather  than  objects 
of  interest  in  the  scene.  For  the  purposes  of  recognition, 
it  would  be  useful  to  order  and  consider  only  some  of  the 
more reliable ones from these groups. In keeping with
our general paradigm of data-driven selection, we order
the  groups  using  a  saliency  measure  and  select  a  few  of 
the  salient  groups.  In  this  section,  therefore,  we  describe 
a  measure  of  saliency  for  the  line  groups  and  then  discuss 
the  utility  of  salient  group-based  selection  in  recognition. 

4.1  Saliency  of  parallel-line  groups 

As in the development of color and texture region
saliency, the focus in designing a measure of saliency
of parallel-line groups will be on capturing the sensory
component of distinctiveness. Thus those properties of
lines that are commonly perceived and fairly general will
be considered. The strategy for assembling the saliency
measure is, as before, to record the factors affecting
saliency and to combine them appropriately in a way
that reflects their importance. Unlike in the case of color
and texture saliency, however, the saliency measure for

Figure  6:  Example  to  illustrate  some  of  the  line  groups  that  are  not  generated  by  the  grouping  algorithm,  (a) 
An  arrangement  of  closely-spaced  parallel  lines,  (b)  The  groups  generated  by  the  algorithm  shown  within  the  two 
rectangular  boxes.  The  asterisk  mark  indicates  the  starting  line  segment  used  to  assemble  the  line  groups,  (c) 
Another  set  of  groups  possible  from  the  arrangement  of  (a)  that  also  obey  all  the  four  grouping  constraints.  Note  the 
overlap between these groups and those generated by the grouping algorithm.

line groups is designed to emphasize the inherent reliability
of groups more than the match with our perceptual
judgment of their importance.

4.1.1  Factors  affecting  saliency  of  line  groups 

Among  the  properties  of  line-groups  that  are  often  ob¬ 
served  are  the  length  of  the  constituent  lines,  the  region 
spanned  by  them,  the  number  of  constituent  lines,  and 
their  overall  orientation  span.  These  properties  of  line 
groups  were  chosen  as  the  factors  affecting  saliency  using 
the  following  rationale: 


1.  Length of constituent lines (L): The length of the
constituent lines is chosen as a factor because groups
with either very long or very short lines are undesirable
from the point of view of recognition. Very long lines in an
image are more likely to span multiple objects, while very
short lines often tend to be due to spurious line segments
resulting from scene clutter or from a very fine
line-segment approximation. Since a group has lines of
varying length, the average length of lines in the group
is taken as a measure of the length of the group.


2.  Region span of a group (R): The region spanned by a
line  group  can  give  indications  of  its  reliability.  A  large 
region  span  often  indicates  a  group  spanning  more  than 
one  object.  We  measure  the  region  spanned  by  a  closely- 
spaced  parallel-line  group  by  the  area  of  the  convex  hull 
of  the  end  points  of  the  constituent  lines. 


3.  Number  of  constituent  lines  (N):  Groups  with  a  very 
large  number  of  lines  may  not  be  entirely  desirable  from 
the  point  of  recognition  as  they  could  contain  a  large 
number  of  features.  Also,  such  groups  could  potentially 
span  multiple  objects.  A  sparse  group  (with  one  or  two 
lines),  on  the  other  hand,  may  not  indicate  any  mean¬ 
ingful  structure.  Either  way,  the  number  of  constituent 
lines  in  a  group  can  affect  its  saliency. 


4.  Orientation span (Θ): A low orientation span indicates
a greater amount of parallelism in a group. Since
the aim in grouping lines here is to capture instances of
closely-spaced parallelism on objects in a scene, groups
exhibiting a greater degree of parallelism are more likely
to indicate a meaningful structure. This also follows
from the viewpoint invariance argument of Lowe that
we mentioned in Section 2.1, namely, that a group of
parallel lines is unlikely to have arisen from an accident
of viewpoint, thereby making them more likely to come
from either a single object or a single structure. The
orientation span is measured by recording the maximum
difference in orientation between lines in a group.
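The four factors listed above can be measured directly from the end points of a group's line segments. The sketch below is illustrative and makes some assumptions: segments are pairs of (x, y) end points, orientations are measured in degrees modulo 180, the convex hull is computed with the standard monotone-chain method (the paper does not say which hull algorithm was used), and the function names are hypothetical. Normalization by image diagonal and image size is left to the caller.

```python
import math

def angle_diff(a, b):
    """Smallest difference between two undirected orientations, in degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def convex_hull(points):
    """Monotone-chain convex hull; returns vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(poly):
    """Shoelace formula; a hull with fewer than 3 vertices has zero area."""
    return abs(sum(x1 * y2 - x2 * y1
                   for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]))) / 2.0

def group_factors(segments):
    """Return the raw factors L, R, N and theta for one line group."""
    lengths = [math.dist(a, b) for a, b in segments]
    angles = [math.degrees(math.atan2(b[1] - a[1], b[0] - a[0])) % 180.0
              for a, b in segments]
    hull = convex_hull([p for seg in segments for p in seg])
    return {
        "L": sum(lengths) / len(lengths),                       # average length
        "R": polygon_area(hull),                                # region span
        "N": len(segments),                                     # number of lines
        "theta": max(angle_diff(p, q) for p in angles for q in angles),
    }
```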


4.1.2  Weighting  functions  for  factors  affecting 
saliency 

To develop a measure of saliency for the line groups,
each of the factors must be weighted to appropriately
reflect their individual contributions in deciding the
saliency of a group. The form of the weighting function,
in most cases, was derived using three criteria: it
should (a) reflect the likelihood of a group coming from a
single object, (b) be a smooth function so that
discontinuities do not indicate an abrupt change in judgment,
and (c) be verifiable from statistical experiments.
The weighting functions chosen for the factors
are as follows:


1.  Length of constituent lines (L): Using the rationale
given earlier, the weighting function is chosen to
de-emphasize both very long and very short lines, while
giving about equal importance to intermediate length
lines. The decision of very short lines is made on an
absolute basis, i.e., lines shorter than 5 pixels are considered
very short, while long lines are decided relative
to the size of the image (by choosing the diagonal length
in the image as the normalizing factor). The weighting
function is as follows:

$$
f_1(L) =
\begin{cases}
\ln(1 - L_n)\ \text{[remainder illegible in source]} & 0 < L < l_1 \\
1 - \text{[illegible in source]} & l_1 < L \text{ and } L_n < l_2 \\
\text{[illegible in source]} & l_2 < L_n < 1.0
\end{cases}
\qquad (3)
$$

where $L_n = L / L_{max}$, $L_{max}$ = diagonal length of the image,
and the various thresholds are $l_1 = 5$ (pixels), $l_2 = 0.4$.
The expressions for the constants $c_1$, $c_2$, $c_3$ are
illegible in the source.

The form of the weighting function was derived by
performing the following experiments. Groups were
formed from several edge images using the grouping algorithm
and the average line lengths of groups were
recorded. The scenes of the images varied in complexity,
having different amounts of scene clutter, contained
several objects, and showed illumination artifacts such
as specularities, interreflections, etc. Figures 2a and 5a
show examples of some typical images tried. A histogram
of the number of groups with a given normalized
length $L_n$ (using 200 bins) was plotted. From this, the
number of groups that came from a single object and had
normalized line length falling in a given bin was noted.
The ratio of the number of groups of a given $L_n$ coming
from a single object to the total number of groups of that
line length was taken to represent the weighting function
$f_1(L)$. This ratio was plotted against $L_n$ and smooth
functions were fit to the resulting curve. These functions
were described by the parameters $c_1$, $c_2$, $c_3$. Finally, the
thresholds $l_1$, $l_2$ were found from the breakpoints in this
ratio curve.
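The histogram-ratio step of this experiment can be sketched as follows. The sketch assumes that each group comes with a normalized value in [0, 1] and a manually assigned flag saying whether it came from a single object; the function name and the use of `None` for empty bins are illustrative choices, and the subsequent curve fitting is not shown.

```python
def empirical_weight(norm_values, from_single_object, bins=200):
    """Per-bin ratio: groups from a single object / all groups in that bin.

    Bins with no groups yield None, since the ratio is undefined there.
    """
    total = [0] * bins
    single = [0] * bins
    for value, is_single in zip(norm_values, from_single_object):
        b = min(int(value * bins), bins - 1)   # clamp value == 1.0 into last bin
        total[b] += 1
        if is_single:
            single[b] += 1
    return [s / t if t else None for s, t in zip(single, total)]
```

The same routine applies to the region-span factor by passing normalized region spans instead of normalized lengths.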


2.  Region span of a group (R): The weighting function
for the region span was chosen to emphasize small and
compact groups. The form of the weighting function
was derived by performing studies similar to the one described
above. Here, the ratio of the number of groups
with a given normalized region span that came from a
single object to the total number of groups with the given
span was taken to represent the weighting function. The
weighting function derived from this ratio was:

$$
f_2(R) =
\begin{cases}
\dfrac{\ln(1 - R_n)}{c_4} & 0 < R_n < r_1 \\
1 - e^{-\text{[illegible in source]}} & r_1 < R_n < r_2 \\
s_2 - c_6 \ln(1 - R_n + r_2) & r_2 < R_n < r_3 \\
s_3\, e^{-c_7 (R_n - r_3)} & r_3 < R_n < r_4 \\
0 & r_4 < R_n < 1.0
\end{cases}
\qquad (4)
$$

where $R_n = R / R_{max}$, and $R_{max}$ = image size, $r_1$ = [illegible],
$r_2 = 0.4$, $r_3 = 0.5$, $r_4 = 0.75$, $s_1 = 0.8$, $s_2 = 1.0$, $s_3 = 0.7$,
$s_4 = 10^{-3}$. The expressions for the constants $c_4, \ldots, c_7$ are
illegible in the source. Here again the thresholds
are chosen in a manner similar to the one described for
the weighting function for the length of constituent lines.


3.  Number of constituent lines (N): Since the average
size  of  a  group  is  small  in  the  groups  generated  by  the 
grouping  algorithm,  the  weighting  function  is  chosen  to 
emphasize  densely  packed  groups,  i.e.  groups  with  a 
larger  number  of  constituent  lines,  as  they  often  indicate 
some  textural  information.  If  such  groups  span  a  large 
area  then  they  will  be  de-emphasized  in  the  weighting 
function  f2(R)  for  the  region-span.  The  weighting  func¬ 
tion  chosen  was: 


$$
f_3(N) = \frac{N}{N_{max}} \qquad (5)
$$


where  N  =  number  of  constituent  lines  in  a  group,  and 
Nmax  =  maximum  number  of  lines  in  any  line  group  in 
the  given  edge  image. 


4. Orientation span (Θ): Using the rationale given earlier, the
weighting function here is designed to emphasize groups showing a
greater degree of parallelism, i.e., a smaller orientation span. To
avoid unfair bias towards groups of single lines (which will have an
orientation span of zero), we assign a small penalty toward single
line groups. The resulting choice of function is:


/4(©)  =  { 


0.1 
(1  + 


e(l.O-eti) 

^glob^l—o  ri«fi 


7) 


0  =  0  and  N  =  1 
otherwise 


where Θ is the orientation span, and cti = 0.4.
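The ratio-based procedure used above to derive the weighting functions for line length and region span can be sketched as follows. This is a minimal illustration, assuming that each group has already been labelled by hand as coming from a single object or not; the function name `empirical_weighting` and its arguments are hypothetical and do not come from the original implementation. The curve fitting and breakpoint detection that follow the ratio computation are not shown.

```python
def empirical_weighting(values, from_single_object, num_bins=200):
    """Per-bin fraction of groups that come from a single object.

    values: normalized measurements in [0, 1] (e.g. L_n or R_n), one per group.
    from_single_object: parallel list of booleans (assumed hand-labelled).
    Returns a list of num_bins ratios; bins containing no groups hold None.
    """
    totals = [0] * num_bins
    singles = [0] * num_bins
    for value, single in zip(values, from_single_object):
        # Clamp so that a value of exactly 1.0 falls in the last bin.
        b = min(int(value * num_bins), num_bins - 1)
        totals[b] += 1
        if single:
            singles[b] += 1
    return [s / t if t else None for s, t in zip(singles, totals)]
```

The resulting per-bin ratios are the raw curve to which smooth functions would then be fit.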


4.1.3  Saliency  measure  for  a  closely-spaced 
parallel-line  group 

The saliency measure for a closely-spaced group of line segments is
obtained by combining the weighting functions reflecting the
contributions from the various factors. Since the factors record
independent properties of line groups, we chose to combine them
linearly to give the following saliency measure:


Saliency of a line group = f_1(L) + f_2(R) + f_3(N) + f_4(Θ)    (7)
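The linear combination in Equation 7 and the ranking of groups by saliency can be sketched as below. This is only an illustration under stated assumptions: the `LineGroup` record and the function names are hypothetical, and the weighting functions f_1, f_2 and f_4 are passed in as callables because their fitted constants are not fully reproduced here. Only f_3 = N / N_max is implemented directly. f_4 is given both Θ and N because the single-line penalty depends on N.

```python
from dataclasses import dataclass


@dataclass
class LineGroup:
    length: float            # normalized line length L_n
    region_span: float       # normalized region span R_n
    num_lines: int           # number of constituent lines N
    orientation_span: float  # orientation span Theta


def f3(num_lines, max_lines):
    """Weighting for the number of constituent lines (Equation 5)."""
    return num_lines / max_lines


def saliency(group, max_lines, f1, f2, f4):
    """Linear combination of the four weighting functions (Equation 7)."""
    return (f1(group.length)
            + f2(group.region_span)
            + f3(group.num_lines, max_lines)
            + f4(group.orientation_span, group.num_lines))


def top_salient(groups, k, f1, f2, f4):
    """Return the k most salient groups, most salient first."""
    max_lines = max(g.num_lines for g in groups)
    ranked = sorted(groups,
                    key=lambda g: saliency(g, max_lines, f1, f2, f4),
                    reverse=True)
    return ranked[:k]
```

With k = 40 or k = 100, `top_salient` corresponds to the selections reported in the results that follow.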

4.1.4  Results 

We now illustrate the use of the saliency measure to judge the
reliability of closely-spaced parallel-line groups. Figures 4f-i show
the line groups found by the grouping algorithm in the image of
Figure 4a. Among these, the 40 most salient groups found using the
above saliency measure are shown in Figure 4j. As can be seen from the
figure, all of the 40 salient groups come from single objects. Figure 5
shows another example in which the line groups generated are shown in
Figures 5f-i and the top 40 salient groups among them are shown in
Figure 5j. Here only two of the salient groups did not come from
single objects. Table 2 shows the results of performing saliency
experiments on a number of images whose average complexity is
indicated by the number of constituent line segments listed in the
table. Here, the last column lists the percentage of unreliable groups
in the top 100 salient groups found using the saliency measure. From
these studies we conclude that the saliency measure captures reliable
groups and can, therefore, be useful in a data-driven selection
mechanism for recognition.

In the discussion so far, we have not analyzed the extent to which the
groups selected by the saliency measure match our perceptual judgment
of the importance of such groups. We chose not to emphasize this
aspect for several reasons. First, in an edge image, other relations
in addition to closely-spaced parallelism may exist between lines. For
example, long smooth curves may be more salient than parallel-line
groups in an edge image. For comparing the performance of the saliency
measure, the subjects should be made to look at only closely-spaced
line groups and ignore other cues for grouping, a task difficult to
achieve in practice. Even if scenes showing only instances of
closely-spaced parallelism were examined, there is the additional
problem due to the grouping algorithm generating only a subset of the
possible groups. Thus not all the groups perceived by a subject may be
generated, and this affects the groups selected by the saliency
measure. Finally, perceptual judgments may be based on a collection of
groups of different orientation (this could indicate groups of curves,
for example), and this is not considered by the saliency measure.

4.2  Use  of  salient  line  groups-based  selection 
in  recognition 

Data-driven selection based on salient line groups is primarily useful
when the object of interest has at least one parallel-line group that
appears among the selected salient line groups. In such cases, the
search for data features that match model features can be restricted
to salient groups, thus avoiding needless search in other areas of the
image. In order for a model group to be found among the salient line
groups, however, it should first be generated by the grouping
algorithm. That is, the choice of the four constraints characterizing
an image


S.No.  Image      Num.   Group            Num.    Avg. Salient  Avg. Salient  % Unreliable
       Size       Lines  Constraints      Groups  Group Size    Group Area    Groups

1.     320 x 576  395    <10, 5, 6, 10>   219     6             0.009         1
2.     256 x 416  756    <6, 5, 6, 10>    382     8             0.006         2
3.     224 x 416  316    <5, 5, 6, 10>    207     6             0.008         7
4.     200 x 492  1232   <5, 10, 9, 12>   552     4.25          0.003         3
5.     232 x 576  884    <5, 5, 6, 10>    454     5.3           0.004         6

Table 2: Characteristics of salient closely-spaced line groups ranked by the saliency measure described in text. In
each case, the top 100 salient groups are considered. The number of groups listed here includes single line groups.


group should be such that a model group, if it exists in the image,
can be captured by these constraints. Since no specific knowledge of
model objects can be used in data-driven selection, the thresholds can
be set based on some rough a priori knowledge about expected objects
in scenes, the distances at which they are imaged, and some general
knowledge about the parameters of the 3D structures that are meant to
be captured in line groups in the images of such scenes. Among the
thresholds, the local and global orientation thresholds,
t_local-orient and t_global-orient, are essentially independent of the
objects in the library, and can be chosen based on an analysis of the
imaging noise and the noise in line-segment approximation. Since a
group captures almost parallel lines, there is not much leeway in
their choice, in that they can only be low values. We chose to model
this noise by allowing a 5-7 degree skew between lines
(t_local-orient) and an overall skew (t_global-orient) of 10 degrees.
The values of t_across and t_along, however, have a major effect on
the lines that are ultimately grouped, and cannot in general be chosen
independent of objects in the library. Larger values of these
thresholds allow somewhat widely-spaced parallelism to be captured,
such as the parallelism inherent in the contour of the triangular
block in Figure 2b. However, this would also decrease the reliability
of the groups, as it allows lines belonging to separate objects to be
grouped. Since our aim (at least in data-driven selection) is to
capture letter and wooden texture occurring on objects in indoor
scenes, we found that for the distances at which the objects are
typically imaged, an interline separation (t_across, t_along) of 5 to
10 pixels is sufficient for capturing most such parallel-line groups
in images. Better methods for choosing these thresholds could be
devised, however.

4.2.1  Search  reduction  using  salient  line  groups 

We  now  estimate  the  search  reduction  that  can  be 
achieved  by  using  salient-line  groups-based  data-driven 
selection.  Following  the  analysis  of  the  number  of 
matches  using  grouping  given  in  Section  2,  if  only  S 
salient  groups  are  retained  for  M,-  image  groups,  then  the 
number  of  matches  to  be  tried  using  the  4-point  align¬ 
ment  scheme  of  recognition  is  J2j=i  nifnf4!) 

where m_i and n_j are the number of features in the model and the
selected salient image groups, respectively. To estimate the number of
matches using such salient line groups, we chose a few model objects
exhibiting closely-spaced parallelism between lines on the object,
generated line groups using the grouping algorithm and retained
a few of the groups. By placing these objects in various scenes, and
using the grouping algorithm and the saliency measure, we retained a
few (about 40) salient line groups in the images of these scenes. The
number of features (i.e., the end points of line segments) in both
model and data groups was recorded. The number of matches with salient
groups-based selection was found using the above formula. For purposes
of comparison, the number of matches without any grouping was also
computed  using  the  formula  0{M^N^)  where  M  and  N 
are the total number of features found in all the model and data line
groups. The results are summarized in Table 3. As can be seen from the
table, the search is always considerably lower when salient
groups-based selection is done prior to recognition. The number of
matches with this type of selection scheme can still be large,
however, as all pairings of model and salient image groups are tried.
Also, some of the salient groups are large in size, thus increasing
the number of matches that need to be tried within a pair of groups.

5  Model-driven  Selection  using  Line 
Groups 

So far we have considered the use of closely-spaced parallel line
groups for data-driven selection, which required the object of
interest to have a salient line group. Further, it was assumed that
such a group could be detected by the use of some default threshold
values for the constraints characterizing closely-spaced line groups.
This will not be of much help when the object of interest has
closely-spaced parallel line groups but they are either not salient or
cannot be captured using the default thresholds. In such cases, the
description of the line groups present on the model object can be used
to perform selection. We now describe one such line groups-based
model-driven selection mechanism. The approach adopted here is to
selectively generate line groups in the image based on the description
of closely-spaced parallel line groups on the model object. Thus the
group generation process is constrained here so that only the likely
matching groups are generated. Once the candidate matching line groups
in the image are obtained, recognition can proceed by examining pairs
of model and image groups as usual.

The criteria developed for a model-driven selection
mechanism in the earlier work [28] are also relevant to
line groups-based model-driven selection. That is, it
must  be  sufficiently  selective  to  avoid  considering  ob- 


M 

N 

Mg 

Avg.  mi 

Group 

Constraints 

■ 

Num.  Matches 

No  Selection 

1. 

466 

1768 

6 

<  5,5,6, 10  > 

2. 

1768 

4 

<  10,5,6, 10  > 

R 1 

3. 

Bnl 

2464 

10 

4 

<  10,5,6,10> 

10 

EE ! 

4.66x10^° 

4. 

358 

1453 

8 

16 

<  2,2,6,10> 

18 

El ! 

1.38x10^^ 

5. 

790 

5 

23 

<  10,5,6, 10> 

12 

E&l! 

4.46x10^^ 

Table 3: Estimated search reduction using salient line groups-based selection. The terms M and N are as
explained in text. Here N_g = 40, i.e., the top 40 salient groups are retained for selection.


viously  impossible  matches,  but  at  the  same  time,  be 
sufficiently flexible to take into account the various problems in
imaging that may cause a model line group in an image to appear
different from its original description. The object may appear
different in a scene because it has undergone pose changes, or because
it is occluded, or because illumination changes as well as artifacts
in the scene such as specularities and interreflections have altered
the appearance of the object. These changes (called observation
conditions, henceforth) can also alter the appearance of the line
groups on the model object. We first examine, therefore, the effects
of these changes on the model line groups. This will then be used to
design a description of model line groups as well as a strategy for
generating matching line groups in the image.

5.1  Effect  of  observation  conditions  on  a  model 
line  group 

Consider a closely-spaced parallel (3D) line group on a model object.
It can be specified by a tuple <t_3d-across, t_3d-along> with the
following interpretation: Between any two consecutive parallel lines
in the model 3D group, the across-the-line distance d_3d-across is no
more than the threshold t_3d-across and the along-the-line distance
d_3d-along is no more than the threshold t_3d-along. The distances
d_3d-across and d_3d-along for parallel lines in 3D are defined in a
way analogous to that of d_across and d_along given in Section 3.1.1.
We now analyze the effect of observation conditions on the appearance
of such 3D line groups on model objects when they are placed in a
scene.

Pose changes: If the allowed transformation of the model object is
restricted to a 3D affine transformation, then under orthographic
projection and scale (to approximate perspective projection) and
assuming no imaging noise and no errors in line segment approximation,
it is known that closely-spaced parallel 3D line groups project to a
set of closely-spaced parallel lines in the image [30].

Footnote: The distance d_3d-across is simply the distance between two
consecutive parallel lines defined as the length of projection of the
end point of one 3D line on the other. The along-the-line distance
d_3d-along is as defined for 2D lines except that the distance d_min,
L(p_t), and L(p_s) are distances between points in 3D.

Footnote: If the transformation takes some of the object out of view,
then this can be treated as the case when some projected parallel
lines coincide.


Further, the order of the lines in the group is preserved, although
the resulting orientation of the parallel lines in the image can be
arbitrary. The inter-line spacing d_across and d_along between the
projected lines will, however, be different from the inter-line
spacing d_3d-across and d_3d-along between the corresponding 3D lines.
If no scale change has occurred during the transformation, then the
spacing between the projected parallel lines in the image can only
decrease. This is because d_across forms a side of a right triangle
whose hypotenuse is the orthographic projection of the inter-line
spacing d_3d-across (as shown in Figure 7), and will always be less
than or equal to d_3d-across. Using a similar argument, we can show
that the along-the-line spacing in the image d_along is less than or
equal to d_3d-along. If the transformation includes scale changes,
then the inter-line spacing between parallel lines varies in
proportion to the scale. That is, for a scale change s (s > 1.0 or
s < 1.0) we have d_across <= s * d_3d-across and similarly
d_along <= s * d_3d-along. Thus the effect of pose changes is to vary
the inter-line spacing between lines while still maintaining the
property of parallelism.
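The bound d_across <= s * d_3d-across can be checked numerically with a small sketch. It assumes orthographic projection along the z axis followed by a uniform scale s; the function names are hypothetical and are introduced only for illustration. The first function gives the 3D distance ab, the second the projected distance a'b', and the third the perpendicular distance from a projected point to a projected line (the side a'c' of the right triangle in Figure 7).

```python
import math


def spacing_3d(a, b):
    """Euclidean distance between two 3D points (length of ab)."""
    return math.dist(a, b)


def projected_spacing(a, b, scale=1.0):
    """Length of a'b': drop z (orthographic projection), then apply scale."""
    return scale * math.dist(a[:2], b[:2])


def across_distance(p, q0, q1, scale=1.0):
    """Perpendicular distance a'c' from the projection of point p to the
    projected line through q0 and q1, after applying the scale."""
    dx, dy = q1[0] - q0[0], q1[1] - q0[1]
    cross = abs(dx * (p[1] - q0[1]) - dy * (p[0] - q0[0]))
    return scale * cross / math.hypot(dx, dy)


# Two parallel 3D lines with direction (1, 0, 0): one through a, one through b.
a = (0.0, 0.0, 0.0)
b = (2.0, 3.0, 4.0)
b_next = (3.0, 3.0, 4.0)
for s in (0.5, 1.0, 2.0):
    # a'c' <= a'b' <= s * ab for every scale s.
    assert across_distance(a, b, b_next, s) <= projected_spacing(a, b, s)
    assert projected_spacing(a, b, s) <= s * spacing_3d(a, b)
```

The chain of inequalities holds for any choice of points, since dropping a coordinate can only shorten a segment and a side of a right triangle is never longer than its hypotenuse.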

Occlusions: The most common effect of occlusions is to corrupt the
projected model line group. That is, depending on the geometry of the
occlusion, some lines in the group may either partially or totally
disappear. But unless a group is completely occluded, the visible
lines of the group maintain the same relative ordering and the
inter-line spacing is dictated by the pose changes undergone by the
model. In rare cases, when the occluding object has similar
closely-spaced parallel line groups, it may cause the two groups to be
merged, and may even affect the inter-line spacing of the model line
group. Thus the effect of occlusions on a model line group is mainly
to change the number of constituent lines in the model groups, and in
rare cases to even alter the inter-line spacing in the groups.


Illumination  changes  and  other  imaging  artifacts: 

When the wavelength characteristics of the light source illuminating
the scene are different from those of the one illuminating the model
object, it may cause a change in the apparent color of the object's
surface. But since we are looking at an edge image to generate the
groups, the edges tend to remain more or less stable. When the




Footnote: This is also true when perspective projection is
approximated by orthographic projection and scale.


Figure 7: Illustration to show that the distance between the projections of 3D parallel lines is less than or equal to the
3D distance between the lines. The projection of the 3D distance d_3d-across (line ab) is given by the line a'b'. The
length of a'b' is <= the length of ab since ab = sqrt((x1 - x2)^2 + (y1 - y2)^2 + (z1 - z2)^2) and
a'b' = sqrt((x1 - x2)^2 + (y1 - y2)^2).
The distance between the projected lines, given by a'c', is <= a'b' by the hypotenuse of a right triangle rule.
light source location is changed, however, then depending on the
position of the 3D parallel line groups relative to the light source,
lines in the projected group may be either partially or totally hidden
in shadow. Further, if the surface of the object is specular, then
specularities occurring in the region of the line group may cause the
group to be corrupted by the partial or total masking of the lines in
the group. Finally, interreflections can cause spurious line segments
to appear as part of the model line group depending on the pattern on
the object that is being reflected from the model's surface. Thus the
effect of illumination changes is similar to occlusions in that both
the number of constituent lines and the inter-line spacing in the
model groups may be changed.

5.2  Model  line  group  description 

We now develop a description of the model line groups taking into
account the effect of the observation conditions. From the above
analysis, we know that a closely-spaced strictly parallel line group
on the model specified by <t_3d-across, t_3d-along> also occurs as a
group of parallel lines in an image with an inter-line spacing that is
decided by the scale change involved in the transformation undergone.
By restricting the scale changes to lie between <s_1, s_2> (with
s_1 <= 1.0 and s_2 >= 1.0), the variation in the inter-line spacing
d_across that can be handled for any model line group appearing in any
image can be restricted to lie in the range
[s_1 * t_3d-across, s_2 * t_3d-across] and similarly, the spacing
d_along to lie in the range [s_1 * t_3d-along, s_2 * t_3d-along]. So
far, the effect of imaging noise and errors in line segment
approximation has not been taken into account. As we remarked earlier,
their effect is to cause the lines in the model group to be slightly
skewed in the image. The thresholds t_m-local-orient and
t_m-global-orient can be used to specify the tolerable inter-line skew
and the overall skew in the line group. The resulting description of a
model line group as it appears in any image can be given by the tuple
<t_3d-across, t_3d-along, t_m-local-orient, t_m-global-orient, s_1, s_2>.
Different line groups on the model will differ in the first two terms
(as the thresholds t_m-local-orient and t_m-global-orient are
independent of the line group), and the scale changes are specified
with respect to the model object.

To generate candidate line groups in a given image using the above
model description, however, the inter-line spacing and the tolerable
scale changes must be expressed in terms of pixel spacing. For this,
the distance at which the model is imaged for building the general
model description can be used as the reference distance. That is, the
pixel spacing corresponding to t_3d-across and t_3d-along can be taken
to be the one existing between the projected lines in the image when
the model is placed at the reference distance and oriented in such a
way that the model line group is parallel to the image plane. By
moving the object closer or farther than the reference distance by the
scale factors s_1 and s_2 respectively, the corresponding change in
the pixel spacing can be recorded. If t_m-2d-across and t_m-2d-along
represent the pixel spacing corresponding to t_3d-across and
t_3d-along respectively, and p_1 and p_2 represent the change in pixel
spacing corresponding to the scale changes s_1 and s_2, then the
description of the model line group that can be used to generate the
line groups in the image becomes
<t_m-2d-across, t_m-2d-along, t_m-local-orient, t_m-global-orient, p_1, p_2>.

5.3  Selective  generation  of  matching  line 
groups 

We now present an algorithm for selectively generating line groups in
the image that match a given model line group description. Since the
model description places a bound on the tolerable scale and pose
changes, the basic strategy is to generate line groups with
successively increasing inter-line spacing (both d_across and d_along)
until the upper bound specified in the model description is reached.
That is, given a model line group description
<t_m-2d-across, t_m-2d-along, t_m-local-orient, t_m-global-orient, p_1, p_2>,
the maximum across-the-line spacing allowed between the lines varies
from t_m-2d-across - p_1 to t_m-2d-across + p_2 (and similarly for the
along-the-line spacing). The matching groups are generated by
hypothesizing a value of spacing between the lines lying in the above
range and generating all groups with inter-line separation specified
by that value. In particular, successive integer pixel spacings from
t_m-2d-across - p_1 to t_m-2d-across + p_2 are used to generate the
groups.
Such groups, called augmented closely-spaced parallel line groups, can
be specified by the tuple
<t_across-low, t_across, t_along-low, t_along, t_local-orient, t_global-orient>
with the following interpretation: It is the largest group of
non-intersecting line segments such that for each line in the group,
there exists no line in the group such that d_across < t_across-low
and d_along < t_along-low, and there exists at least one line in the
group obeying the following constraints:

1. The across-the-line distance d_across between the lines is such
that t_across-low <= d_across <= t_across.

2. The along-the-line distance d_along between the lines is such that
t_along-low <= d_along <= t_along.

3. The orientation difference between the lines is no more than a
threshold t_local-orient.

and the entire group satisfies the assumption that the maximum
orientation change between any two lines in the group is no more than
a threshold t_global-orient.
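The three pairwise constraints and the group-wide orientation constraint of an augmented group can be sketched as simple predicates. This is a minimal illustration, not the grouping algorithm itself: the `AugmentedSpec` record and the function names are hypothetical, the bounds are assumed to be inclusive, and orientations are assumed to be given in degrees with no wrap-around handling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AugmentedSpec:
    across_low: float     # t_across-low
    across: float         # t_across
    along_low: float      # t_along-low
    along: float          # t_along
    local_orient: float   # t_local-orient, in degrees
    global_orient: float  # t_global-orient, in degrees


def pair_is_linked(d_across, d_along, orient_diff, spec):
    """True if two lines satisfy constraints 1-3 of the augmented group."""
    return (spec.across_low <= d_across <= spec.across
            and spec.along_low <= d_along <= spec.along
            and orient_diff <= spec.local_orient)


def group_orientation_ok(orientations, spec):
    """True if the overall orientation spread stays within t_global-orient.

    Assumes orientations do not wrap around (e.g. 179 vs 1 degrees).
    """
    return max(orientations) - min(orientations) <= spec.global_orient
```

A full implementation would apply `pair_is_linked` to neighboring lines and `group_orientation_ok` whenever groups are merged.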

If the observation conditions include only pose changes, and if all
the lines in a model group are equi-spaced, then the model group
appearing in the image (the visible part of it, that is) is bound to
be present in one such augmented group because its consecutive lines
would exhibit an inter-line spacing within the specified range. In the
presence of occlusions and other observation conditions, and when the
inter-line spacing in the model group is not uniform, the model group
can still be captured in the augmented groups, albeit in a fragmented
form. That is, the model group may be partitioned (and possibly merged
with adjacent lines) into several augmented groups. But as long as two
adjacent lines in the model group are visible in the image, they can
still be captured in one of the augmented groups and this should, in
theory, be sufficient for recognition (as the two line end points can
provide four features).

5.3.1  Algorithm  for  generating  augmented  line 
groups 

The algorithm for selectively generating the groups satisfying a model
description
<t_m-2d-across, t_m-2d-along, t_m-local-orient, t_m-global-orient, p_1, p_2>
proceeds by first generating the line group
<t_m-2d-across - p_1, t_m-2d-along - p_1, t_m-local-orient, t_m-global-orient>
using the grouping algorithm of Section 3.2. This can capture model
groups that have unequal inter-line separation in the case of the
model object undergoing pose change specified by the lower limit of
t_m-2d-across - p_1. Then augmented closely-spaced line groups
specified by
<t^{i-1}_across, t^i_across, t^{i-1}_along, t^i_along, t_local-orient, t_global-orient>
are successively generated using a modified version of the grouping
algorithm of Section 3.2 as follows:

1. Let t^0_across = t_m-2d-across - p_1 and
t^0_along = t_m-2d-along - p_1.

2. For i = 1 to p_2 - p_1 do

•  Let t^i_across = 1 + t^{i-1}_across and
t^i_along = 1 + t^{i-1}_along.

•  The grouping algorithm of Section 3.2 is applied to the line
segments with the following modifications to Steps 2a and 2c:

-  In Step 2a, an annulus of neighborhood lying between the rectangles
2t^{i-1}_across x 2(t^{i-1}_along + l) and
2t^i_across x 2(t^i_along + l), where l is the length of the line, is
scanned and all lines passing through this annulus but not through the
inner rectangle are retained.

-  In Step 2c, in addition to checking for t_m-global-orient, each
enclosing group being merged with the current group is checked for
violations against the annulus neighborhood constraint mentioned
above.

5.3.2  Analysis 

The above algorithm makes (p_2 - p_1) passes over the line segments in
generating the matching groups for each model line group. However, the
total number of matching groups generated is still linear in the
number of line segments since each line can belong to at most
(p_2 - p_1) groups, one for each line lying in the annulus of
neighborhood between the outer and inner rectangles of dimensions
2t^i_across x 2(t^i_along + l) and
2t^{i-1}_across x 2(t^{i-1}_along + l).

The above procedure can be repeated for generating matching groups for
each model group separately. Alternatively, the allowable inter-line
spacings for all model line groups can be pooled together to form
ranges of pixel spacing for all the model groups, and the search for
matching groups can be done for each such range using the above
algorithm. The time to generate the augmented line groups for an
iteration i is still O(n^2) (n is the number of line segments) since
the grouping algorithm is the same as before, and Step 2c examines
each pair of line segments at most once. For the allowed scale
changes, the range (p_2 - p_1) is small enough so that the entire
operation of selective group generation can be done in O(kn^2) time,
where k << n is a constant representing the number of passes over the
line segments.


An  Example 

We now illustrate model-driven selection using parallel-line groups
with an example. Figure 8 shows model line groups being used to
perform selection. Here the parallel lines on the ladder part of the
toy fire truck serve as the model line groups and are specified by the
constraints <5, 0, 6, 10, 3, 0>, implying that the allowable scale
changes are from a maximum across-the-line spacing of (5 - 3 =) 2
pixels up to 5 pixels (in other words, allowing the object to be
imaged farther than it is in the model description). Here no variation
is allowed in the along-the-line spacing as the model lines all
overlap (i.e., have t_m-2d-along = 0). The model line groups with the
given specification are shown in Figure 8b. Figure 8c shows (an edge
image of) a scene in which the model object appears at a different
orientation and has a portion of it occluded. Figures 8d-g show the
matching augmented closely-spaced line groups obtained using
successively increasing line spacing as specified by the ranges
<0, 2, 0, 0, 6, 10>, <2, 3, 0, 0, 6, 10>, <3, 4, 0, 0, 6, 10>, and
<4, 5, 0, 0, 6, 10>, respectively. These matches to the indicated
model line groups are shown collectively in Figure 8h. As can be seen
from the figure, most of the model line groups are captured in the
matching groups, although there is evidence of fragmentation in two of
the groups marked 1 and 2 as shown in Figure 8b.
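The successive across-the-line spacing ranges used in this example can be reproduced with a short sketch. It is an illustration only: the function name `spacing_bands` is hypothetical, the lower limit is assumed to be clamped at zero, and only the across-the-line dimension is handled (the along-the-line spacing is held fixed at 0 in the example). The first band corresponds to the ordinary group generated at the lower limit, and each later band to one augmented group per integer pixel step.

```python
def spacing_bands(t, p1, p2):
    """Return (low, high) pixel-spacing bands covering t - p1 up to t + p2.

    t:  model pixel spacing (e.g. t_m-2d-across)
    p1: allowed decrease in pixel spacing
    p2: allowed increase in pixel spacing
    """
    lower = max(t - p1, 0)   # assumed clamp: spacing cannot be negative
    upper = t + p2
    bands = [(0, lower)]     # ordinary group at the lower limit
    for low in range(lower, upper):
        bands.append((low, low + 1))  # one augmented group per pixel step
    return bands


# Model description <5, 0, 6, 10, 3, 0>: t = 5, p1 = 3, p2 = 0.
print(spacing_bands(5, 3, 0))  # [(0, 2), (2, 3), (3, 4), (4, 5)]
```

Each returned pair supplies the first two entries of an augmented group tuple, matching the ranges listed for Figures 8d-g.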

5.3.3  Discussion 

Model-driven selection using the above algorithm generates enough
candidate match groups for a model line group to avoid false negatives
under most observation conditions. Further, by requiring the groups to
have a minimum inter-line spacing, it avoids generating unnecessary false
positive matches to model line groups. Such false positives can easily
arise if a simpler strategy for group generation is used, such as
generating ordinary (rather than augmented) closely-spaced parallel line
groups by successively increasing the line spacing from t_2d-across - p1
to t_2d-across + p2. Such a scheme would create successively bigger
groups (since a group with small line spacing would always satisfy the
constraint of a bigger line spacing) that are often more unreliable and
unlikely matches to model groups.
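
The difference between the two strategies can be seen in a toy setting.
The sketch below is a deliberate simplification: each pair of adjacent
parallel lines is summarized by a single spacing value, and the names
`cumulative_groups` and `banded_groups` are hypothetical. It only
illustrates why an upper bound alone produces nested, growing groups,
while adding a minimum spacing keeps them separate.

```python
def cumulative_groups(spacings, thresholds):
    """Ordinary grouping: keep every spacing below each upper bound."""
    return [[s for s in spacings if s <= t] for t in thresholds]


def banded_groups(spacings, bands):
    """Augmented grouping (assumed form): require min < spacing <= max."""
    return [[s for s in spacings if lo < s <= hi] for lo, hi in bands]


spacings = [1.0, 2.5, 3.5, 4.5]          # hypothetical inter-line spacings
print(cumulative_groups(spacings, [2, 3, 4, 5]))
print(banded_groups(spacings, [(0, 2), (2, 3), (3, 4), (4, 5)]))
```

With upper bounds only, every later group contains all earlier ones; with
bands, each spacing is assigned to exactly one group.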

Figure 8: Illustration of line-groups-based model-driven selection.
(a) Edge image of a model object showing instances of closely-spaced
parallelism between lines. (b) Some of the line groups extracted using
the grouping algorithm with the constraints t_across = 5, t_along = 0,
t_local-orient = 6, t_global-orient = 10. (c) An edge image of a scene in
which the model object appears. (d)-(g) Augmented closely-spaced line
groups generated using the description of the model line groups shown in
(b). (h) The line groups in the image that are possible matches to the
model line groups of (b) under the allowed scale and pose changes.

5.4  Search reduction using model-driven line grouping

The model-driven selection mechanism using line groups described above
identifies candidate groups in the image that could be potential matches
for model line groups under some allowable transformation, taking into
account the effect of occlusions, illumination changes, etc. These
matching model and image line groups can then be given to a recognition
system that will isolate features from the line groups to actually solve
for the pose of the model object. To see the search reduction possible
with model-driven selection, let Mg model line groups be used to perform
model-driven grouping. Let the total number of matching groups be Ng,
with ki image groups matching model group i. Letting (i, j) represent a
match between model group i and image group j, and letting (mi, nj) stand
for the number of features in the matching groups, the number of matches
that may have to be tried to align the model object with the image is
[formula illegible in source].

To estimate the search reduction due to model-driven grouping, we
selected some model objects (views of them, that is) possessing
closely-spaced parallel lines on the surface, generated and retained some
line groups, and recorded their specifications
< t_2d-across, t_2d-along, t_local-orient, t_global-orient >. We then
used scale bounds of s1 = 0.5 and s2 = 2.0 (this allows objects in the
scenes to be imaged at half to twice the distance at which their model
descriptions were recorded) to complete the model line group
descriptions. By placing these objects in scenes containing clutter, and
allowing partial occlusions and illumination changes, we ran the
selective group generation algorithm of Section 5.3.1 to record the
matching image line groups for model line groups. The end points of line
segments were considered as features to be used for recognition, and the
number of features in both the model and image line groups were recorded.
These experiments gave the values for Mg, Ng, mi, nj, ki in the above
formula. Table 4 shows the results of these studies, with column 10
showing the number of matches using model-driven grouping evaluated using
the above formula. The number of matches that would be required without
grouping is also shown in the table for comparison. As can be seen, the
number of matches is far less with model-driven grouping. To get an
estimate of the actual search reduction as well as the number of false
positives and negatives due to model-driven grouping, however, the
grouping mechanism should be integrated with an actual recognition system
and its performance evaluated. The results of such experiments are
discussed in Section 7.
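
The printed expression for the number of matches is not recoverable in
this copy. As a cross-check on Table 4, the Python sketch below uses two
counts that reproduce the printed mantissas of every row when fed the
tabulated averages: ordered correspondences of four features for the
"no selection" column, and two matched features in each of two group
pairs for the model-driven column. Both counts, and the function names,
are inferences from the numbers and should not be read as the formula
stated in the text.

```python
from math import comb, perm


def matches_without_selection(M, N, k=4):
    # Assumed count: choose k model features, match them to an ordered
    # selection of k image features.
    return comb(M, k) * perm(N, k)


def matches_with_grouping(Mg, ki, mi, nj):
    # Assumed count: two features matched (in either order) within one
    # model/image group pair, squared for two such pairs.
    per_pair = 2 * comb(mi, 2) * comb(nj, 2)
    return (Mg * ki * per_pair) ** 2


# (M, N, Mg, avg. ki, avg. mi, avg. nj) for rows 1-5 of Table 4.
TABLE_4 = [
    (466, 1768, 9, 150, 6, 4),
    (140, 1768, 10, 111, 4, 4),
    (140, 2464, 10, 94, 4, 6),
    (358, 1453, 8, 48, 16, 8),
    (130, 790, 5, 25, 23, 6),
]

for M, N, Mg, ki, mi, nj in TABLE_4:
    print(f"{matches_without_selection(M, N):.2e}",
          f"{matches_with_grouping(Mg, ki, mi, nj):.2e}")
```

Because the second count uses group averages rather than per-group sums,
agreement with the table is only to the two or three digits printed there.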


Even with model-driven grouping, the number of matches, when considered
on an absolute basis, is still very large. This can again be attributed
to the large number of matches ki for model line groups, and to the
sometimes large size of the matching groups. To handle scale changes,
large inter-line spacing values have to be examined for group generation,
unlike in the case of data-driven selection. This may cause some of the
groups to be unreliable or large-sized because of merging across objects.
Both these problems can be alleviated if model-driven grouping is used in
conjunction with prior region selection done using more reliable cues
such as color and texture. One such method of combining grouping with
prior selection is discussed in the next section.


6  Line  Grouping  in  Conjunction  with 
Prior  Region  Selection 

So far we have examined grouping of line segments based on the constraint
of closely-spaced parallelism as an independent selection mechanism. But
our original motivation for grouping lines was to organize the features
within prior selected color or texture regions into small groups,
primarily for reducing the search in recognition. We now explore the use
of line grouping within regions that are selected a priori based on cues
such as color and texture.

Data or model-driven selection using line groups can be easily achieved
within previously selected regions by modifying the grouping algorithms
of Sections 3.2 and 5.3.1 to assemble lines obeying the additional
constraint that they all lie within the selected regions. This restricts
not only the number of groups generated in the image but also their size,
by preventing the lines belonging to adjacent objects from being merged,
thus making such groups more reliable. Moreover, when model-driven
grouping is done within prior selected regions, an additional constraint
is provided by the enclosing regions, which restricts the possible
matches to model line groups even further. For example, when the regions
are prior selected based on model color regions, then, as shown in [27],
a correspondence between model and selected image color regions is also
established. The search for image line groups matching a model line group
can therefore be restricted to color regions in the image that correspond
to the model color regions spanned by the model line group.
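
A minimal sketch of the extra region constraint follows, assuming that
the selected regions are given as a binary mask and that a line counts as
lying within the regions when both of its end points do. The mask
representation, the end-point test, and the function name are all
illustrative assumptions; the grouping algorithms of Sections 3.2 and
5.3.1 would then run only on the lines that survive this filter.

```python
def lines_within_regions(lines, mask):
    """Return the segments whose two end points fall on selected pixels.

    lines: list of ((x1, y1), (x2, y2)) integer pixel coordinates.
    mask:  list of rows; mask[y][x] is True for pixels in a selected region.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0

    def selected(point):
        x, y = point
        return 0 <= x < width and 0 <= y < height and bool(mask[y][x])

    return [seg for seg in lines if selected(seg[0]) and selected(seg[1])]


# Hypothetical 4 x 6 mask with a selected block in columns 1-3, rows 1-2.
mask = [[False] * 6 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2, 3):
        mask[y][x] = True

lines = [((1, 1), (3, 1)),   # both end points inside the region
         ((1, 2), (5, 2)),   # one end point leaves the region
         ((0, 0), (0, 3))]   # entirely outside the region
print(lines_within_regions(lines, mask))
```

A stricter test (for example, checking every pixel along the segment)
could be substituted without changing the rest of the pipeline.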

To estimate the search reduction using line grouping in conjunction with
prior region selection, we performed experiments in which model-driven
line groups were generated within model color regions. The information
about color regions spanned by a model line group was used as an
additional constraint in finding matching image line groups. An example
of such restricted model-driven selection using line groups is shown in
Figures 9 and 10. Figure 9 shows the result of color-based selection as
described in [27]. Figures 9a and 9b show two views of a model object
used to construct its 3-dimensional description. Figure 9c shows the
region adjacency graph describing the color regions in the model object,
obtained using the color region segmentation algorithm described in [28].
The result of color-based region selection using the model object
description of Figure 9c in the scene of Figure 9d is shown in Figure 9e.
Next, Figure 10 shows the result of line-groups-based selection within
the prior selected regions in the scene of Figure 9d. The model group
specification is the same as in Figure 8, namely, < 5, 0, 6, 10, 3, 0 >.
Figure 10d shows the regions isolated in the image using color-based
model-driven selection. Figures 10e-h show the matching line groups
generated using the augmented closely-spaced grouping algorithm within
the selected regions of Figure 10d. Finally, Figure 10i shows all the
matching groups using the allowable transformations for the line groups
shown in Figure 10b. By performing similar
experiments on a number of model objects and scenes, we recorded the
resulting values of Mg, Ng, mi, nj, ki (these terms were defined in
Section 5.4), and these are shown in Table 5. The number of matches using
model-driven grouping with prior region selection was calculated using
the same formula that was given in Section 5.4 and is shown in column 10
of Table 5. This can be compared with the number of matches using
model-driven line grouping without prior region selection given in
Table 4. As can be seen from the tables, combining line grouping with
prior region selection based on cues such as color can greatly reduce the
estimated search involved in recognition. This is also corroborated, as
we will see next, by experiments done with an actual recognition system.

S.No.  M    N     Mg  Avg.  Avg.  Avg.  Group                   Num. matches   Num. matches
                      ki    mi    nj    constraints             (no selection) (model-driven selection)
1.     466  1768   9  150    6     4    < 5, 0, 6, 10, 3, 0 >   1.88 x 10^22   5.9 x 10^10
2.     140  1768  10  111    4     4    < 7, 0, 6, 10, 5, 2 >   1.49 x 10^20   6.38 x 10^9
3.     140  2464  10   94    4     6    < 7, 0, 6, 10, 5, 1 >   5.6 x 10^20    2.86 x 10^10
4.     358  1453   8   48   16     8    < 3, 1, 6, 10, 1, 2 >   2.98 x 10^21   6.65 x 10^12
5.     130   790   5   25   23     6    < 6, 0, 6, 10, 2, 1 >   4.39 x 10^18   9.00 x 10^11

Table 4: Estimated search reduction using line groups-based model-driven
selection. The terms M, N, Mg, Ng, ki, mi, nj are as explained in text.
The allowed scale and pose changes in each case are indicated in the
augmented group constraints.
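
To put a number on "greatly reduce", the sketch below compares the
model-driven columns of Tables 4 and 5 row by row. It assumes a count of
the form (Mg * ki * 2 * C(mi, 2) * C(nj, 2))^2, which matches the printed
mantissas in both tables when evaluated on the tabulated averages; this
form and the function name are inferred from the numbers, not quoted from
Section 5.4.

```python
from math import comb


def matches_with_grouping(Mg, ki, mi, nj):
    # Assumed count: two features matched within each of two group pairs.
    per_pair = 2 * comb(mi, 2) * comb(nj, 2)
    return (Mg * ki * per_pair) ** 2


# (Mg, avg. ki, avg. mi, avg. nj) for rows 1-5 of each table.
TABLE_4 = [(9, 150, 6, 4), (10, 111, 4, 4), (10, 94, 4, 6),
           (8, 48, 16, 8), (5, 25, 23, 6)]
TABLE_5 = [(9, 26, 6, 4), (10, 20, 4, 4), (10, 19, 4, 4),
           (8, 17, 16, 6), (5, 13, 23, 4)]

reductions = [matches_with_grouping(*without) / matches_with_grouping(*within)
              for without, within in zip(TABLE_4, TABLE_5)]
for row, factor in enumerate(reductions, start=1):
    print(f"row {row}: about {factor:.0f}x fewer estimated matches")
```

Most of the gain comes from the drop in the average number of candidate
image groups per model group (ki), which enters the count squared.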

Restricting line grouping to prior selected regions has the disadvantage,
though, that it relies on the correctness of the prior selection
mechanism, which cannot always be assumed. Color-based selection, for
example, does identify a good portion of the regions containing the
object. But the region isolation is often not very precise, so that some
groups containing spurious line segments may still be formed in such a
grouping process.

7  Actual  Search  Reduction  in  a 
Recognition  System  due  to  Line 
Groups-based  Selection 

Although the search is greatly reduced by performing grouping within
prior selected color regions, the estimated numbers are still large (on
the order of 10^8 to 10^11 in Table 5). A recognition system that
actually does this amount of search is far from practical. These numbers
were arrived at using a worst-case scenario in which only two pairs of
matching groups could be found at the end, after searching through the
entire set of possible matching pairs. In practice, we expect to find a
lot of good matching pairs much sooner in the search. To test the actual
search reduction possible in practice, we built a recognition system and
integrated the line grouping-based selection mechanism to record the
improvement in performance. The linear combination of views-based
alignment was used for a test-bed recognition system [31]. The 3D models
were constructed from two 2D views with full correspondence between them
obtained using a method described in [23]. Corner features extracted from
both the model and image were used to perform the alignment, and line
segment features were used for doing the verification. The search for
corresponding alignment features was done using an interpretation tree
type search driven from the image features [9]. We then used color-based
selection to isolate areas in the image that are likely to belong to the
object using the method described in [27]. Then line grouping was
performed within the selected color regions to obtain line groups that
match model descriptions. Two pairs of matching line groups were searched
and features within the matching pairs were tried for finding the
alignment transform. Sometimes three pairs of matching line groups had to
be tried to obtain sufficient features for good alignment. Figure 11
shows an example of recognition being performed with selection based on
color and line grouping. The model line groups and the matching image
line groups are as shown in Figure 10. A set of matching line groups and
the corresponding corner features within them that yield a transformation
that is verified by the recognition system to be correct are shown in
Figures 11e and 11f. The projected model overlaid on the original image
is shown in Figure 11g.

By considering several (around 600) random orderings of the list of
groups and features within groups in a number of different scenes, we
recorded the average number of matches that needed to be tried before
successful verification. Some of these results are shown in Table 6. The
models and scenes are the same as those used in Tables 4 and 5, but the
features here are corners instead of the end points of line segments. The
number of matches actually explored by the recognition system for finding
seven corresponding corner features using line grouping-based selection
within color-selected regions is indicated in column 10 of Table 6. The
rather larger number of matches for a smaller model object in entry 5 of
the table is due to the larger size of groups (the maximum size was 23)
even though the number of model groups is small. Compared to the number
of matches explored, detailed verification was done for only a few (about
1000) of the matches. The estimated number of matches that would be
explored without selection for seven corresponding features is shown in
column 9 for comparison. From these results, we concluded that line
grouping-based selection, when performed within prior selected regions,
leads to a tremendous improvement in the performance of a recognition
system.

S.No.  M    N     Mg  Avg.  Avg.  Avg.  Group                   Num. matches   Num. matches
                      ki    mi    nj    constraints             (no selection) (with prior region selection)
1.     466  1768   9   26    6     4    < 5, 0, 6, 10, 3, 0 >   1.88 x 10^22   1.77 x 10^9
2.     140  1768  10   20    4     4    < 7, 0, 6, 10, 5, 2 >   1.49 x 10^20   2.07 x 10^8
3.     140  2464  10   19    4     4    < 7, 0, 6, 10, 5, 1 >   5.6 x 10^20    1.87 x 10^8
4.     358  1453   8   17   16     6    < 3, 1, 6, 10, 1, 2 >   2.98 x 10^21   2.39 x 10^11
5.     130   790   5   13   23     4    < 6, 0, 6, 10, 2, 1 >   4.39 x 10^18   3.89 x 10^10

Table 5: Estimated search reduction using restricted line groups-based
model-driven selection within prior selected color regions.
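
The averaging over random orderings described in Section 7 can be
sketched as follows. This is a generic illustration, not the test-bed
system: `candidates` stands for any list of candidate matches,
`is_correct` stands in for the verification step, and the seed, the
candidate count, and the two "correct" entries are all hypothetical.

```python
import random


def average_trials(candidates, is_correct, runs=600, seed=0):
    """Average number of candidates tried before the first verified one."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        order = candidates[:]
        rng.shuffle(order)            # one random ordering of the search
        for tried, candidate in enumerate(order, start=1):
            if is_correct(candidate):
                total += tried
                break
        else:
            total += len(order)       # nothing verified: whole list tried
    return total / runs


candidates = list(range(100))         # hypothetical candidate matches
correct = {17, 42}                    # hypothetical verifiable matches
avg = average_trials(candidates, lambda c: c in correct)
print(f"average matches tried over 600 orderings: {avg:.1f}")
```

Averaging over orderings removes the dependence on where the correct
matches happen to sit in any one list, which is why the measured numbers
fall well below the worst-case estimates.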


Figure 9: Illustration of color-based model-driven selection. (a)-(b) Two
views of a model object used to construct a three-dimensional
description. (c) A region adjacency graph description of the color
regions on the model object. (d) A scene in which the object appears.
(e) The result of color-based selection using the model description of
(c).

Figure 10: Illustration of model-driven selection using line groups
within prior selected color regions. (a) Edge image of a model object
showing instances of closely-spaced parallelism between lines. (b) Some
of the line groups extracted using the grouping algorithm with the
constraints t_across = 5, t_along = 0, t_local-orient = 6,
t_global-orient = 10. (c) An edge image of a scene in which the model
object appears. (d) The region isolated using color-based model-driven
selection. (e)-(h) Matching line groups using successively increasing
inter-line spacing as described in text. (i) The line groups in the image
that are possible matches to the model line groups of (b) under the
allowed scale and pose
changes.

S.No.  M    N    Mg  Avg.  Avg.  Avg.  Group                   Num. matches   Num. matches
                     ki    mi    nj    constraints             (no selection) (explored in recognition)
1.     160  [?]   7    8    4     3    < 5, 0, 6, 10, 3, 0 >   9.93 x 10^[?]  3.9 x 10^[?]
2.      54  [?]  10    7    3     3    < 7, 0, 6, 10, 5, 2 >   3.77 x 10^[?]  8.4 x 10^[?]
3.      54  [?]  10    9    3     3    < 7, 0, 6, 10, 5, 1 >   2.34 x 10^[?]  1.5 x 10^[?]
4.     114  [?]   8   11    4     4    < 3, 1, 6, 10, 1, 2 >   2.45 x 10^[?]  9.2 x 10^[?]
5.      30  [?]   5   13    6     3    < 6, 0, 6, 10, 2, 1 >   1.13 x 10^[?]  1.35 x 10^[?]

Table 6: Actual search reduction using restricted line groups-based
model-driven selection within prior selected color regions. Here mi and
nj refer to number of corner features (rather than the end points of line
segments) within a
line group. Seven corresponding corner features were used for recognition.

Figure 11: Illustration of recognition in the regions selected by the
attentional selection mechanism using color and line groups. (a) Edge
image of a model object showing instances of closely-spaced parallelism
between lines. (b) Some of the line groups extracted using the grouping
algorithm with the constraints t_across = 5, t_along = 0,
t_local-orient = 6, t_global-orient = 10. (c) An edge image of a scene in
which the model object appears. (d) The line groups in the image region
selected by color that are possible matches to the model line groups of
(b) under the allowed scale and pose changes. (e)-(f) The pairs of
matching model and image line groups that found a good alignment
transform. The circles here show the matching corner features within the
line groups. (g) The model object projected into the
image of (c) using the alignment transform given by the correspondence
shown in (e) and (f).

8  Conclusions


In this paper we have examined the use of the property of closely-spaced
parallelism between lines in performing data and model-driven selection.
Towards this end, a scheme for grouping line segments was presented that
possesses several features that make it compare favorably with other
existing schemes for grouping edges. First, closely-spaced parallelism
occurs commonly within letter textures and contours of geometrical
objects. Secondly, the groups generated tend to be compact and more
likely to come from single objects (particularly in the data-driven
mode). Also, the number of such groups is linear in the number of lines,
and they can be generated by a fast algorithm. Lastly, the size of the
groups tends to be mostly small, except when merging across objects
occurs (which can be reduced when grouping is restricted to prior
selected regions). Thus the grouping based on closely-spaced parallelism
presented here satisfies most of the desirable requirements of grouping
for recognition. Moreover, unlike existing approaches to grouping, we
have also examined the use of grouping in the model-driven mode. In doing
so, an analysis of the changes that can occur to model line groups due to
observation conditions was done, and this was found useful in performing
model-driven selection using such line groups. Finally, since
closely-spaced line groups tend to span a small portion of an object,
they are good for achieving reliability in selection but may not be as
useful in actually solving for the pose of the object as, for example,
groups assembled using other constraints such as convexity.